Abstract


We  have  completed  our  research  on  a  transparent  optical  multistage  interconnection  network 
(MIN)  for  interfacing  and  distribution  of  parallel  access  optical  memories.  This  network  allows 
for  the  transparent  transmission  of  optical  data,  which  is  required  for  many  applications  such  as 
high  speed  image  computing,  data  base  search,  digital  library,  and  telemedicine.  The  transparent 
switching  fabric  of  this  network  is  based  upon  the  unique  technology  of  birefringent  computer 
generated  holograms  (BCGH).  BCGH  technology,  network  design  and  initial  system  architecture 
studies  were  first  developed  under  contract  #F30602-91-C-0094,  where  we  showed  the  viability 
and  capabilities  of  BCGH  based  switching  elements. 

During  this  contract  we  have  designed  and  implemented  a  folded  free-space  optical  MIN 
using  a  novel  folded  dilated  bypass-exchange  switch  (DBS)  built  using  BCGH  and  polarization 
rotator elements. The DBS allows for the elimination of first-order cross-talk due to inaccuracies
of polarization rotation and diffractive element fabrication errors. By utilizing the three
dimensional  functionality  of  the  optical  elements,  the  DBS  elements  can  be  stacked  in  the 
vertical  dimension  by  folding  the  switch  along  its  central  line  of  symmetry.  The  interconnection 
between  multiple  DBS  is  also  folded,  forming  a  compact  transparent  optical  MIN  package.  This 
compact  optical  system  design  is  easily  aligned  and  the  use  of  space-variant  lenslets  permits 
implementation  of  arbitrary  network  architectures  and  interconnection  patterns.  Additionally,  the 
use  of  patterned  micro-mirrors  to  fold  the  DBS  allows  for  spatial  filtering  of  the  undesired  high 
diffraction  orders,  thereby  decreasing  the  cross-talk. 

Fabrication  and  characterization  of  the  folded  transparent  optical  MIN  demonstration  system 
has  highlighted  the  advantages  of  this  design.  Using  BCGH  elements  with  measured  signal-to- 
noise  ratios  (SNR)  of  30:1,  the  fabricated  folded  2x2  DBS  improved  SNR  to  60:1.  Increasing  the 
scaling  to  a  4x4  MIN  resulted  in  SNR  of  120:1.  Therefore,  using  the  folded  MIN  design,  system 
scaling  is  limited  only  by  insertion  losses,  which  may  be  mitigated  with  the  use  of  optical 
amplifiers.  To  handle  the  path  contentions  that  may  arise  in  these  types  of  networks  we  have 
developed  a  novel  switching  protocol  named  the  gated-hold  protocol.  A  stochastic  model  has 
also  been  developed  that  allows  for  the  rigorous  performance  analysis  of  these  algorithms. 

We  have  also  continued  developing  BCGH  technology  by  using  a  highly  birefringent 
substrate  material  allowing  fabrication  of  single  substrate  polarization  selective  elements 
designed  with  a  novel  multiple  order  delay  (MOD)  approach.  Additionally,  form  birefringent 
computer  generated  holograms  (FBCGH)  have  been  developed.  The  FBCGH  use  subwavelength 
gratings  to  generate  birefringence  several  times  greater  than  possible  using  natural  anisotropic 
materials,  demonstrating  SNR  greater  than  250:1.  To  meet  the  switching  speed  requirements  of 
the  next  generation  MIN  systems,  we  have  also  conducted  extensive  modeling  and  testing  of 
PLZT  based  phase  modulation  devices  for  the  implementation  of  high-speed  polarization  rotation 
devices. Using readily available MOSFET drivers, we have developed PLZT based devices
working at 10 MHz, i.e. an optical MIN reconfiguration time three orders of magnitude faster
than that of a system using current ferroelectric liquid crystal technology.
1.  Overview


This final report outlines the past two years of research on the development of a compact folded
optical  multistage  interconnection  network  (MIN)  for  parallel  access  and  distributed  optical 
memory.  For  convenience,  publications  [1-12]  and  schematics  are  included  as  appendices  to  this 
report. 

Electronic  multiplexing  can  be  used  to  access  distributed  memory  devices  (optical  and 
magnetic  disk  arrays,  multi-head  disks,  etc.),  providing  data  rates  at  1  to  10  GHz.  Existing 
memories  are  commonly  arranged  in  a  distributed  environment,  where  the  use  of  optical  fiber 
transmission lines becomes the most effective medium. The interfaces to such memory systems
would benefit from a transparent optical switching fabric that allows the transmission of
information,  in  any  data  format,  at  rates  limited  only  by  the  speed  characteristics  of  the  electronic 
devices  at  the  transmitter  (e.g.  memory  system  node)  and  the  receiver  (e.g.  user  node). 

The  recent  advances  in  optical  amplifiers  have  increased  interest  in  such  transparent  optical 
networks.  Additionally,  since  polarization  compensation  in  single  mode  fiber  [13]  allows 
automatic  and  stable  control  of  the  polarization  states  of  transmitted  optical  signals,  it  may 
enable  utilization  of  polarization  dependent  all-optical  switching  fabric.  Such  polarization 
switching  has  been  proposed  for  ‘free-space’  MIN  for  switching  and  multiprocessor 
interconnections [14-16]. Building on the birefringent computer generated hologram (BCGH)
research done under Contract #RADCF-30602-91-c-0094, we have recently demonstrated a 4x4
optical MIN (see Appendix 1). Interconnection controls, which necessarily operate at much
slower  rates,  can  be  executed  electronically.  Such  a  transparent  network  can  accommodate 
bandwidths  several  orders  of  magnitude  greater  than  possible  with  electronic  systems.  This 
transmission  bandwidth  capability  is  comparable  with  parallel  access  optical  memories  that  are 
expected  to  provide  aggregate  bandwidths  approaching  Tbits/s. 

A  block  diagram  of  such  a  memory  distribution  system  is  shown  schematically  in  Figure  1. 
The  control  of  the  switching  fabric  and  its  input  and  output  nodes  can  be  performed  using 
existing  electronic  network  technology  (e.g.  a  transparent  optical  MIN  may  work  in  parallel  to 
existing  electronic  MIN,  where  the  electronic  network  handles  the  transmission  of  low  priority 
data and control of the high-speed optical network). The transparent optical switching fabric is
implemented  with  polarization  selective  diffractive  optics  and  polarization  rotator  arrays  being 
developed  at  UCSD.  The  primary  goal  of  this  project  is  to  use  these  technologies  to  implement  a 
compact  and  scalable  high-speed  interconnection  system  for  optical  memory  access. 


Figure  1.  Transparent  Optical  Multistage  Interconnection  Network  can  overlay  and  is 
controlled  by  an  existing  electronic  network  made  up  of  various  memory  device  and  user 
nodes  and  a  server. 

During the course of this contract we have focused our research on the creation of a
demonstration compact optical MIN as well as the development of the novel technologies required for its
operation. In particular we have made advances in four areas: optical MIN system design,
switching network protocols, polarization selective diffractive
optical elements (DOE), and high-speed polarization rotation device arrays.

One  of  the  novel  systems  that  we  have  developed  is  the  folded  dilated  bypass-exchange 
switching  (DBS)  fabric.  This  unique  folded  design  allows  us  to  take  advantage  of  the  symmetries 
of  the  optical  switching  system.  The  folded  architecture  places  all  like  elements  into  single  2D 
arrays, thus reducing the system space requirements as well as simplifying system alignment. The
system demonstration has been fully designed, fabricated and characterized. The demonstration
system  utilizes  our  current  accomplishments  in  developing  the  underlying  technologies.  In 
further  developing  BCGH  we  have  made  significant  improvements  in  the  concept  developed 
under  contract  #RADCF-30602-91-c-0094.  We  have  designed,  fabricated  and  characterized  a 
single  substrate  BCGH  that  resolves  many  of  the  fabrication  issues  that  were  seen  in  two 
substrate  BCGH.  We  have  also  developed  rigorous  design  tools  to  investigate  design  and 
fabrication  tolerances  for  polarization  selective  DOE.  Furthermore,  we  have  designed  and 
fabricated  single  substrate  form  birefringent  polarization  selective  DOE.  Finally,  we  are 
modeling,  fabricating  and  characterizing  a  high-speed  array  of  polarization  rotation  devices  that 
will  provide  several  orders  of  magnitude  improvement  in  speed  over  current  polarization  rotation 
technology. 

In this report we summarize our progress in each of the four major areas of this research
described above and also present results of the system demonstration. The following
section describes the basic concepts and development that have gone into the system design. In
Section  3  we  describe  the  switching  protocol  strategies  that  have  been  developed  for  this  project. 
Section  4  will  discuss  the  work  done  on  polarization  selective  computer  generated  holograms.  In 
Section  5  we  describe  the  continued  work  on  polarization  rotation  devices.  Section  6  describes 
the  fabrication  and  characterization  of  the  two  system  demonstrations.  In  Section  7  we  conclude 
this  report  with  an  assessment  of  the  work  that  has  been  performed  under  this  contract. 

2.  System  Design  of  the  Interconnection  Network 

The  interconnection  network  module  that  has  been  designed  and  constructed  enables  dynamic 
switching  between  input  and  output  optical  signals.  The  system  is  transparent  to  the  optical 
signals,  i.e.  the  signals  are  not  converted  to  the  electrical  domain  for  digital  switching  and  are  not 
remodulated  on  an  optical  carrier.  Due  to  this  transparent  feature,  the  system  is  independent  of 
the  bit-rate,  or  bandwidth,  of  the  signals.  The  switching  is  performed  by  an  applied  electrical 
signal,  coming  from  an  auxiliary  module,  running  the  protocol  and  routing  algorithms  (see 
Section  3). 
The interconnection network module is based on several enabling technologies for
controlling  the  phase  front  and  polarization  of  optical  beams.  The  first  is  computer  generated 
holograms  (CGH),  which  provide  general  optical  element  functionality  independent  of  the 
polarization of the light. The second enabling technology is the
polarization selective computer generated hologram. Similar in concept to the CGH, it can encode two general
functionalities, one for each polarization state. Under contract #RADCF-30602-91-c-0094
we succeeded in designing, fabricating and experimentally evaluating these
elements by constructing special computer generated holograms in birefringent material (BCGH).
The third technology is polarization modulators, which switch between polarization states under
an applied electric field. By integrating these technologies we have formed the basis for a
transparent  optical  switch,  whose  precise  functionality  is  dictated  by  the  choice  and  placement  of 
these  elements. 



Figure 2. Bypass-exchange switch built using two BCGH and one polarization rotation
element. Input beams are orthogonally polarized.

In the past we have demonstrated the bypass-exchange switch (BES), a fundamental switch for two
signals [17]. Two BCGH elements and an electrooptic (EO) polarization rotator can be used to
construct a 2x2 optical BES (see Figure 2). The first BCGH element combines and focuses two
inputs into the polarization rotator, which either exchanges their polarizations or not. The second
BCGH separates and directs the outputs to different destinations. Inaccuracies of the polarization
rotator can result in cross-talk in this implementation of the BES. The polarization rotator can be
characterized by an associated error of $\delta$ in the rotation angle, which results in a cross-talk term
proportional to $\sin(|\delta|)$. The BCGH elements can be described by an associated cross-talk, $\varepsilon$, due
to fabrication errors such as etch depth and mask alignment. The combined cross-talk component
at the output of the BES is proportional to $|\delta| + |\varepsilon|$, assuming $\delta, \varepsilon \ll 1$. The signal-to-noise ratio
(SNR) of a MIN can be described by

$$\mathrm{SNR} = \log_{10}\left(\frac{1}{\delta_c}\right) - \log_{10} S \qquad (1)$$

where $\delta_c = |\delta| + |\varepsilon|$ and $S$ is the number of interconnection stages [1]. For scalability of MIN
network size (i.e. $S$ is growing) the cross-talk, $\delta_c$, of each stage must be reduced to achieve the
necessary SNR.
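
As a rough numerical illustration of Eq. (1), the sketch below takes the equation in the form SNR = log10(1/delta_c) - log10(S), with delta_c = |delta| + |epsilon|, and evaluates it for a few stage counts. The function name and the sample error values are hypothetical and chosen only for illustration; they are not measurements from this work.

```python
import math

def min_snr_log10(delta, epsilon, stages):
    """Eq. (1): log10 SNR of a MIN limited by first-order cross-talk.

    delta   -- polarization rotation error (assumed << 1)
    epsilon -- BCGH cross-talk from fabrication errors (assumed << 1)
    stages  -- number of interconnection stages S
    """
    if stages < 1:
        raise ValueError("stages must be >= 1")
    delta_c = abs(delta) + abs(epsilon)  # combined per-stage cross-talk
    if delta_c <= 0:
        raise ValueError("combined cross-talk must be positive")
    return math.log10(1.0 / delta_c) - math.log10(stages)

if __name__ == "__main__":
    # Hypothetical per-stage errors, for illustration only.
    for s in (1, 3, 10):
        print(s, round(min_snr_log10(0.005, 0.005, s), 3))
```

In this form, each tenfold increase in the number of stages costs one decade of SNR, which is why the per-stage cross-talk has to shrink as the network grows.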

The dilated bypass-exchange switch (DBS), which utilizes a more complex structure,
performs the functionality of the BES with improved cross-talk performance [18]. The DBS,
which has two input and two output signals, is comprised of four 1x2 elements coupled together.
The structure of the DBS guarantees that each bypass-exchange switch has only one signal
propagating through it, and that the majority of the cross-talk terms exit from the ports that are
not utilized. It can be shown that the remaining cross-talk is now of second order in $\delta$ and $\varepsilon$. Under the
assumption $\delta, \varepsilon \ll 1$, cross-talk is greatly decreased and SNR increased.

A  free-space  DBS  can  be  implemented  with  a  combination  of  lenslet,  BCGH  and  polarization 
rotator  elements  (see  Figure  3).  As  opposed  to  the  bypass-exchange  switch  implementation,  the 
input  signals  to  the  DBS  are  spatially  displaced  from  each  other  and  may  be  directed  using  off- 
axis  Fresnel  lenslets  fabricated  from  isotropic  material.  This  allows  the  single-substrate  BCGH, 
which  can  be  highly  sensitive  to  fabrication  error,  to  be  made  using  a  relatively  simple  linear 
phase  encoding.  One  polarization  rotator,  whose  polarization  state  determines  whether  the  switch 
will  function  in  bypass  or  exchange  mode,  can  control  both  signals’  polarization.  The  first  BCGH 
elements  have  one  phase  encoding  for  a  vertical  polarization  state  (e.g.  bypass  mode)  and  a 
different phase encoding for the other orthogonal polarization state. The second set of BCGH
elements has the conjugate functionality of the first set, serving to recollimate the beams. The
second  polarization  modulator  is  set  in  the  same  state  as  the  first,  so  that  the  output  polarization 
state  is  also  identical  to  the  input  state.  The  final  lenslets  direct  the  output  beams  according  to  the 
interconnection  pattern  required  by  the  system  architecture.  Linear  cross-talk  terms,  which  exit 
the  DBS  with  a  polarization  state  that  is  orthogonal  to  that  of  the  desired  signals,  can  be  filtered 
out  with  a  polarizer. 


Figure 3. Optical implementation of the DBS, made up of four BCGH elements.
The order of elements on the left and right hand sides is symmetric about the central
line of symmetry. Input beams are independent.


The DBS complexity, while mitigating the linear cross-talk problem, increases the number of
components required for the same functionality as the BES. However, taking advantage of the
symmetry  of  the  DBS  (see  Figure  3)  and  the  three-dimensional  functionality  of  our  free-space 
optical  elements  can  reduce  the  complexity  of  these  switches.  This  is  done  by  introducing  a 
propagation  direction  component  along  the  vertical  axis,  i.e.  a  small  incidence  angle,  as  well  as 
placing  a  mirror  at  the  line  of  symmetry.  The  input  beams  will  pass  through  a  lenslet-rotator- 
BCGH  combination  at  one  elevation  and  react  according  to  the  encoded  information  at  that 
location.  Upon  reflection  from  the  mirror  the  beam  passes  through  another  BCGH-rotator-lenslet 
combination  at  a  lower  elevation  (see  Figure  4).  By  folding  the  switch  in  this  manner,  similar 
elements (i.e. BCGH, lenslets and polarization rotators) are located in the same plane. Therefore,
a  DBS  may  be  fabricated  using  a  mirror  and  2x2  arrays  of  BCGH,  lenslets  and  polarization 
rotator  elements. 


Figure  4.  Folded  optical  dilated  bypass  exchange  switch  locates  like  elements  into  2D 
arrays.  Micro-mirrors  can  be  used  to  filter  high  diffraction  order  noise. 

The  advantage  of  this  folding  technique  is  further  enhanced  when  applied  to  an  optical  MIN. 
By  placing  a  mirror  at  the  output  of  the  first  folded  DBS,  the  beam  will  reflect  back  at  a  lower 
elevation  and  be  coupled  into  subsequent  DBS  located  below  the  first.  In  this  manner  all  similar 
elements  of  multiple  DBS  may  be  combined  into  two-dimensional  arrays,  minimizing  the 
number  of  components  required  for  the  entire  MIN:  a  single  lenslet  array,  a  BCGH  array,  a 
polarization  rotator  array  and  a  pair  of  folding  micro-mirror  arrays.  A  folded  optical  MIN  can  be 
packaged  as  a  resonator,  where  each  round  trip  represents  a  stage,  and  all  stages  are  stacked 
vertically  (see  Figure  5).  An  input  signal  beam  enters  the  system  at  a  small  angle  and  reflects 
through  a  prescribed  number  of  stages  before  exiting  in  the  desired  spatial  output  channel. 

For compactness and optimal use of available fabrication technologies, we use computer-generated
hologram (CGH) off-axis Fresnel lenslets. BCGH and CGH lenslets are diffractive
elements,  whose  diffraction  efficiency  dictates  the  amount  of  unwanted  higher  diffraction  terms 
produced.  Using  continuous  mirrors,  these  unwanted  orders  may  propagate  within  the  MIN, 
resulting  in  additional  cross-talk.  However,  by  using  micro-mirrors  deposited  on  a  transparent 
substrate, only the desired diffraction terms from the BCGH will reflect back for further
propagation,  while  the  unwanted  noise  terms  exit  the  system. 




Figure  5.  Folded  optical  multistage  interconnection  network  shows  compact  packaging 
using  2D  arrays  of  optical  elements.  For  an  8x8  folded  MIN  there  are  three  layers  of 
DBS,  which  are  shown  as  separate  rows  on  the  2D  arrays. 

The  arrangement  of  the  optical  elements  in  2-D  arrays  also  allows  for  relatively  simple 
alignment  of  the  system  components.  Correct  alignment  will  dictate  that  during  each  pass 
through  the  cavity  the  beams  will  land  on  the  correct  elements.  Based  on  geometrical  ray  tracing, 
the  displacement  of  each  beam  from  its  correct  position  and  the  size  of  the  beam  at  the  BCGH 
elements  (i.e.  larger  or  smaller  than  the  predicted  size  at  the  element)  will  indicate  which  optical 
elements  (BCGH,  micro-mirrors,  etc.)  are  incorrectly  positioned.  Since  the  mirror  planes  are 
mostly  transparent,  beam  propagation  within  the  cavity  can  be  viewed  with  the  use  of  external 
imaging optics and a CCD camera. The beam size and position were monitored in situ, allowing
for accurate alignment of optical elements and mirror planes.

The  use  of  space-variant  lenslets  in  each  polarization  selective  element  allows  for  the  design 
of  arbitrary  connection  patterns  such  that  any  network  topology  may  be  implemented.  In  the 
folded optical MIN, the number of channels and the interconnection architecture used dictate the
size of the arrays but do not increase the number of components. For example, an 8x8 optical
MIN architecture (of $\log_2 8 = 3$ stages) requires arrays of size 8x6 for the BCGH and polarization
rotator elements.
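
The array dimensions quoted above can be reproduced with a short calculation. The sketch assumes that every stage contributes two rows (one per pass through the folded DBS) and that each array is N elements wide. This generalization is inferred from the two cases given in the text (2x2 arrays for a single DBS, 8x6 arrays for the 8x8 MIN) and is not stated explicitly in the report; the function name is hypothetical.

```python
def folded_min_array_size(n_channels):
    """Return (columns, rows) of each element array in a folded N x N MIN.

    Assumes N is a power of two, log2(N) stages, and two rows per stage
    (one for each pass through the folded DBS).
    """
    if n_channels < 2 or n_channels & (n_channels - 1):
        raise ValueError("n_channels must be a power of two >= 2")
    stages = n_channels.bit_length() - 1  # log2(N) for a power of two
    return n_channels, 2 * stages
```

Under these assumptions the element count per array grows as 2N log2(N), while the number of separate components stays fixed.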

3.  Switching  Protocols 

A  fully  connected  network  provides  the  connectivity  such  that  any  input  can  connect  to  any 
output.  These  networks  can  be  blocking  if  there  are  internal  contentions  for  links  with  existing 
network  connections.  Networks  are  rearrangeably  non-blocking  if  any  idle  input  may  be 
connected  to  any  idle  output  provided  that  we  may  rearrange  existing  connections  [1].  A 
rearrangeably  non-blocking  network  can  also  be  achieved  by  cascading  two  fully  connected 
networks.  Thus,  fully  connected  networks  are  smaller  and  have  simple  routing  algorithms,  but 
may  incur  blocking  situations  which  need  to  be  resolved  on  the  protocol  level.  Rearrangeably 
non-blocking  networks  can  accommodate  any  interconnection  pattern  at  the  expense  of  a  larger 
network.  Routing  is  more  complex  for  these  networks,  and  short  breaks  in  communication  may 
result  when  reconfiguring  a  network. 
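
To make the notion of an internal link contention concrete, the sketch below checks a set of connection requests for conflicts in a log2(N)-stage network. The report does not specify the topology at this point, so the example assumes an omega (shuffle-exchange) network with destination-tag routing purely for illustration; the function names are hypothetical.

```python
def omega_path(src, dst, n_bits):
    """Links occupied after each stage by a src -> dst connection.

    Assumes an omega network: each stage is a perfect shuffle followed by
    a 2x2 switch that sets the lowest address bit to the next destination
    bit (most significant bit first).
    """
    mask = (1 << n_bits) - 1
    pos = src
    links = []
    for stage in range(n_bits):
        pos = ((pos << 1) | (pos >> (n_bits - 1))) & mask  # perfect shuffle
        bit = (dst >> (n_bits - 1 - stage)) & 1            # routing tag bit
        pos = (pos & ~1) | bit                             # switch setting
        links.append(pos)
    return links

def find_contentions(requests, n_bits):
    """Return (stage, link, first_src, second_src) for every shared link."""
    used = {}
    conflicts = []
    for src, dst in requests:
        for stage, link in enumerate(omega_path(src, dst, n_bits)):
            key = (stage, link)
            if key in used and used[key] != src:
                conflicts.append((stage, link, used[key], src))
            else:
                used[key] = src
    return conflicts
```

For example, in an 8x8 network of this kind the requests 0 -> 0 and 4 -> 1 both need the same link after the first stage, so one of them has to wait or be rerouted; resolving such cases is the job of the switching protocol.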

For  applications  that  require  a  MIN  system  of  limited  size  and  complexity  it  may  be 
convenient  to  implement  a  fully  connected  network.  For  centralized  control  of  this  blocking 
system  we  have  developed  a  novel  switching  protocol  to  handle  routing  path  contentions.  The 
basic blocking system architecture is that there are N users accessing M data sources (e.g. optical
or magnetic disks) via a server regulated network. The problem is that, given n arrivals of
information at times t1, t2, ..., tn at any node in an interconnection network, the protocol
controlling the network must decide the order of service of all N users in order to avoid
contention and provide quality of transmission. Previous work on these kinds of blocking or
conflicting transmission problems has used non-real time and real time scheduling disk access
policies. The negotiated techniques, to assure stability of the network, only give conditions on
buffer size and load limited to a single disk without describing the transient behavior of the
system. The analysis of these policies has generally been done by deterministic approaches, which
do not allow rigorous testing of the suitability of these protocols. Moreover, no tangible results
have been found in the analysis of these protocols in scheduling and buffer allocation for disk
arrays.

The gated-hold protocol that has been developed by Paul Dietrich and Prof. Ramesh Rao (see
Appendices 2 and 3) has been rigorously analyzed using stochastic methods. Within a
cycle, this protocol guarantees the delivery of all requests. The gated-hold protocol compares favorably to
other  blocking  network  protocols.  However,  the  overhead  imposed  by  the  gated-hold  protocol  is 
characterized  by  the  hardware  complexity.  Depending  on  the  hardware  configuration  a  different 
method  of  analysis  may  be  required. 

Most studies of disk array scheduling have used computer
simulations or deterministic approaches, which do not always describe the random nature of
communication systems, which are characterized by burstiness at peak loads. We are developing
a  stochastic  model  that  will  give  us  a  better  representation  of  these  systems.  This  will  allow  us  to 
quantify  the  characteristics  of  a  given  system  and  then  to  derive  an  optimal  protocol. 

Another  issue  that  we  continue  to  investigate  is  the  seek-delay  time  of  a  disk.  Data  is  stored 
in  blocks  on  the  disk  and  a  certain  time  is  needed  for  the  disk  head  to  locate  that  block  on  the 
disk. We are investigating disk access policies with the aim of minimizing the
overall seek delay. The issues considered include storing information contiguously or non-contiguously,
coordination of disk movement, and service of requests on single and multi-disk
systems. As an example, with the assumption of uniform arrivals of N users (which is considered
the  worst  case)  accessing  a  single  disk,  we  have  established  the  expected  dead-time  expression 
for  a  range  of  scheduling  policies.  Given  the  dead-time  that  we  can  tolerate  and  the  rotation 
speed  of  the  read  and  write  disk  head,  we  can  find  the  buffer  size  requirement  (or  vice  versa). 

We  are  also  interested  in  multimedia  networks,  which  may  impose  a  totally  new  set  of 
requirements  onto  a  system  protocol.  The  combination  of  text,  image,  video  and  voice  may 
require  varying  information  packet  sizes,  transmission  and  access  speeds  as  well  as  varying 
priorities  for  different  information  types  and  packets. 
4.  Polarization  Selective  Computer  Generated  Holograms

One  of  the  key  technological  components  in  our  transparent  photonic  MIN  is  the  birefringent 
computer  generated  hologram  (BCGH)  array.  These  diffractive  optical  elements  (DOE)  have 
independent  impulse  responses  for  two  orthogonal  linear  polarizations.  Two  degrees  of  freedom 
are  required  in  order  to  encode  two  different  phase  functions  into  one  diffractive  element.  In  its 
original  design,  the  BCGH  consists  of  two  surface  relief  substrates  with  at  least  one  of  them 
birefringent.  The  first  birefringent  substrate  introduces  a  relative  phase  delay  in  each  pixel 
between  the  two  orthogonal  linear  polarizations.  The  second  isotropic  substrate  controls  the 
absolute  phase  of  both  polarization  components.  The  use  of  two  independent  etch  depths 
provides  the  two  degrees  of  freedom  necessary  for  independent  impulse  responses.  Previously, 
we have demonstrated BCGH with different functionality using two birefringent LiNbO3
substrates  [17].  The  fabricated  BCGH  elements  showed  high  diffraction  efficiency  (as  high  as 
60%)  and  large  polarization  contrast  ratio  (>  100:1). 
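
The role of the two etch depths can be illustrated with a simple per-pixel calculation. The sketch below is a scalar thin-element model, not the design procedure used in this work: it assumes one birefringent relief and one isotropic relief, each delaying the wave by 2*pi*(n - 1)*d/wavelength relative to air, and solves for the two depths that give a desired pair of phases. The function names and the example indices in the usage note are hypothetical.

```python
import math

def two_substrate_depths(phi_o, phi_e, n_o, n_e, n_iso, wavelength):
    """Solve for per-pixel etch depths (d_biref, d_iso), in wavelength units.

    phi_o, phi_e -- target phases (radians) for the two polarizations
    Assumes n_e > n_o and a scalar thin-element model.
    """
    two_pi = 2.0 * math.pi
    # The birefringent relief sets the relative delay between polarizations.
    relative = (phi_e - phi_o) % two_pi
    d_biref = relative * wavelength / (two_pi * (n_e - n_o))
    # The isotropic relief adds the same phase to both, fixing the absolute phase.
    phase_o_so_far = two_pi * (n_o - 1.0) * d_biref / wavelength
    remaining = (phi_o - phase_o_so_far) % two_pi
    d_iso = remaining * wavelength / (two_pi * (n_iso - 1.0))
    return d_biref, d_iso

def pixel_phases(d_biref, d_iso, n_o, n_e, n_iso, wavelength):
    """Phases (ordinary, extraordinary) produced by a pair of etch depths."""
    k = 2.0 * math.pi / wavelength
    common = k * (n_iso - 1.0) * d_iso
    return (k * (n_o - 1.0) * d_biref + common,
            k * (n_e - 1.0) * d_biref + common)
```

For instance, calling `two_substrate_depths(0.0, math.pi, 2.0241, 2.2600, 1.5, 0.5145)` returns the depths (in micrometres) for a pixel that leaves one polarization unchanged and shifts the other by pi, and `pixel_phases` can be used to confirm the result modulo 2*pi.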

Due  to  the  special  configuration  of  a  two  substrate  BCGH  element,  different  analyses  are 
required  to  understand  the  relationship  between  the  performance  of  a  BCGH  and  the 
imperfection  introduced  during  the  fabrication  processes.  Using  both  scalar  diffraction  analysis 
(i.e.,  Fourier  analysis)  and  rigorous  vector  field  analysis  (i.e.,  rigorous  coupled  wave  analysis), 
we  found  that  two  substrate  BCGH  elements  are  more  sensitive  to  fabrication  imperfections 
compared  to  regular  diffractive  optical  elements.  Our  numerical  analyses  indicate  that  to 
construct  a  high  performance  two  substrate  BCGH,  tighter  fabrication  tolerance  is  required  in 
terms  of  exposure  dosage,  etch  depth  accuracy  and  alignment  accuracy. 

The  difficulties  we  encountered  in  making  two  substrate  BCGH  serve  as  one  of  the 
motivations  for  us  to  investigate  new  approaches  to  make  a  BCGH  element.  One  approach  we 
took  is  to  increase  the  etch  depth  in  a  single  birefringent  substrate  (see  Appendix  4)  so  that  the 
phase delay caused by the etched pixel compared to an unetched one is more than $2\pi$. We name
such elements multiple order delay (MOD) holograms. Inside a MOD single substrate BCGH
element, each pixel of the microstructure is deep-etched such that propagating optical waves will
exhibit multiple periods of phase delay. Therefore, the variable order of phase delay provides us
another degree of freedom, in addition to the etch depth, to encode two phase functions.
Compared to a two substrate BCGH, a MOD element has the advantages of a simpler element
configuration and relative ease of fabrication, and may yield higher performance in terms of
polarization contrast ratio and diffraction efficiencies.

We  designed,  fabricated  and  evaluated  both  experimentally  and  numerically  a  binary  phase 
level MOD with polarization selectivity. The substrate material was YVO4, which has refractive
indices of $n_o = 2.0241$ and $n_e = 2.2600$ at a wavelength of 0.5145 µm. The hologram is designed
to be a polarization beam splitter, i.e., to transmit one polarization straight and deflect the
orthogonal polarization at an angle. The element was etched to a designed depth of 1.032 µm
using ion beam etching. The diffraction efficiencies measured under ordinary and extraordinary
polarization illumination are 70.8% into the zero order, 37.4% into the +1st order, and 38.9%
into the -1st order. The measured polarization contrast ratios are 79.7:1 at zero order, 33.0:1 at
+1st order and 32.5:1 at -1st order. We also simulated the performance of a MOD using rigorous
coupled wave analysis (RCWA). The simulation results agreed well with our experimental
evaluation.
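
The design values quoted above can be checked with a short scalar estimate. The sketch below computes the phase delay of an etched pixel, in waves, for each polarization, and then the ideal efficiencies of a binary phase grating with that phase step. It assumes a lossless thin grating with a 50% duty cycle and ignores Fresnel reflection and vector effects, so it is only an upper-bound style estimate and not the RCWA model used in this work; the function names are hypothetical.

```python
import math

def phase_delay_waves(n, depth, wavelength):
    """Phase delay of an etched pixel relative to an unetched one, in waves."""
    return (n - 1.0) * depth / wavelength

def binary_grating_efficiencies(delay_waves):
    """Scalar-theory efficiencies (zero order, each first order) of a
    50% duty-cycle binary phase grating with the given phase step."""
    half_step = math.pi * delay_waves  # half of the phase step, in radians
    eta_zero = math.cos(half_step) ** 2
    eta_first = (2.0 / math.pi) ** 2 * math.sin(half_step) ** 2
    return eta_zero, eta_first

if __name__ == "__main__":
    wavelength, depth = 0.5145, 1.032  # micrometres, values from the text
    for label, n in (("ordinary", 2.0241), ("extraordinary", 2.2600)):
        waves = phase_delay_waves(n, depth, wavelength)
        eta0, eta1 = binary_grating_efficiencies(waves)
        print(f"{label}: {waves:.3f} waves, eta0={eta0:.3f}, eta1={eta1:.3f}")
```

With these numbers the ordinary polarization sees close to a whole number of waves, so most of its light stays in the zero order, while the extraordinary polarization sees close to a half-integer number of waves, so its light is split mainly between the two first orders. The measured efficiencies are expected to fall below these idealized values because of reflection losses and fabrication errors.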

In  the  MIN  demonstration  we  use  four  phase  level  MOD  elements.  The  diffraction  efficiency 
was  experimentally  measured  at  55.3%  for  the  0  order  and  45.3%  for  the  +1  order,  with 
extinction  ratios  of  10:1  and  30:1,  respectively.  The  diffraction  efficiencies  into  the  +1  order 
were better than for the binary phase element, but significantly below the theoretically predicted
diffraction efficiency of 80.5% for a four phase level DOE. The relatively poor
efficiency  can  be  attributed  to  errors  in  etch  depth  during  the  fabrication  process.  In  the  MOD 
approach,  etch  depth  must  be  controlled  with  a  higher  degree  of  accuracy  than  that  used  for 
conventional  DOE.  This  is  because  the  error  introduced  by  the  over/under  etch  is  determined  by 
the  ratio  of  the  percent  of  etch  error  to  a  fraction  of  the  total  etch  depth  that  is,  in  effect, 
responsible  for  encoding  the  desired  phase  values  on  a  given  pixel.  Therefore,  the  multiple  phase 
level  elements  were  more  susceptible  to  fabrication  errors  due  to  an  inconsistent  ion  beam 
etching  process. 
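
The etch-depth sensitivity argument can be made concrete with a small calculation. The sketch below is illustrative only: the 2% etch error and the 2.5-wave MOD delay are hypothetical values (the latter is roughly the extraordinary-ray delay of the binary element described earlier), and the function name is not from the report.

```python
def phase_error_waves(total_delay_waves, relative_etch_error):
    """Phase error (in waves) caused by a fractional over- or under-etch."""
    return total_delay_waves * relative_etch_error


RELATIVE_ERROR = 0.02   # hypothetical 2% over-etch

# Conventional binary DOE: the whole etch depth encodes the half-wave step.
conventional = phase_error_waves(0.5, RELATIVE_ERROR)

# Hypothetical MOD pixel: two full waves of delay plus the encoded half wave.
mod_element = phase_error_waves(2.5, RELATIVE_ERROR)

print(f"conventional DOE: {conventional:.3f} waves of phase error")
print(f"MOD element:      {mod_element:.3f} waves of phase error")
print(f"penalty factor:   {mod_element / conventional:.1f}x")
```

The same fractional etch error therefore produces a phase error that grows with the number of whole waves carried by the element, even though those whole waves encode no information.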

We have also applied the multiple order delay approach to single substrate BCGH which are
color selective (see Appendix 5). We designed and fabricated a binary phase level color selective
beam splitter for wavelengths 1.30 µm and 1.55 µm. The substrate material used was BK7 glass,
which has an index of refraction of 1.5027 at 1.30 µm and 1.5004 at 1.55 µm. The diffraction
efficiency was 39% into the +1 and -1 diffraction orders and 0.83% 0 order transmission for 1.55
µm wavelength. For 1.30 µm wavelength light the 0 order transmission was 83% with less than
1.2% diffraction into the other orders.
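
The color-selective behavior can be illustrated with a scalar search for an etch depth that gives a whole number of waves at 1.30 µm (no diffraction) and close to a half-integer number of waves at 1.55 µm (strong diffraction). This is a sketch under the assumption of a binary grating in air; the report does not state the fabricated depth, so the result below is illustrative rather than a reconstruction of the actual device.

```python
LAMBDA_PASS_UM, N_PASS = 1.30, 1.5027      # wavelength to transmit undiffracted (from the text)
LAMBDA_SPLIT_UM, N_SPLIT = 1.55, 1.5004    # wavelength to diffract (from the text)


def candidate_depths(max_order=10):
    """Yield (order, depth_um, delay at the split wavelength in waves)."""
    for m in range(1, max_order + 1):
        # Depth giving exactly m waves of delay at the pass wavelength.
        depth = m * LAMBDA_PASS_UM / (N_PASS - 1.0)
        delay = (N_SPLIT - 1.0) * depth / LAMBDA_SPLIT_UM
        yield m, depth, delay


def distance_from_half_wave(delay_waves):
    """How far the fractional delay is from the ideal half wave."""
    return abs((delay_waves % 1.0) - 0.5)


best_order, best_depth, best_delay = min(
    candidate_depths(), key=lambda c: distance_from_half_wave(c[2])
)
print(f"order {best_order}: depth {best_depth:.3f} um, "
      f"{best_delay:.3f} waves at {LAMBDA_SPLIT_UM} um")
```

Because the depth must satisfy two wavelength conditions at once, it is several times deeper than a conventional single-wavelength binary grating, which is the essence of the multiple order delay approach.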


Figure  6.  Micro-graph  of  FBCGH  linear  grating.  Phase  encoding  is  performed  using  a 
form-birefringent  based  sub-wavelength  grating  structure. 

Another  approach  to  BCGH  technology  can  be  implemented  using  form  birefringent 
subwavelength  gratings  (FBCGH)  on  single  isotropic  substrates  (see  Appendix  6, 7,  8  and  9). 
Electric  fields  parallel  to  the  grating  (TE  polarization)  and  perpendicular  to  the  grating  (TM 
polarization)  need  to  satisfy  different  boundary  conditions,  resulting  in  different  effective 
refractive indices for the two orthogonal polarizations. It has been shown that the birefringence
possible using this approach can be several times greater than that of naturally occurring materials.
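
The polarization-dependent effective indices can be estimated with the standard zeroth-order effective medium formulas, which hold when the grating period is much smaller than the wavelength. The sketch below is illustrative: the ridge index of 3.4 and the 50% fill factor are hypothetical values, not parameters taken from the report.

```python
import math


def effective_indices(n_ridge, n_groove, fill_factor):
    """Zeroth-order EMT indices (n_TE, n_TM) of a subwavelength lamellar grating.

    fill_factor is the fraction of the period occupied by the ridge material.
    """
    eps_ridge, eps_groove = n_ridge ** 2, n_groove ** 2
    n_te = math.sqrt(fill_factor * eps_ridge + (1.0 - fill_factor) * eps_groove)
    n_tm = 1.0 / math.sqrt(fill_factor / eps_ridge + (1.0 - fill_factor) / eps_groove)
    return n_te, n_tm


# Hypothetical high-index substrate with air grooves.
n_te, n_tm = effective_indices(n_ridge=3.4, n_groove=1.0, fill_factor=0.5)
form_birefringence = n_te - n_tm

# Natural birefringence of YVO4 from the indices quoted earlier in this report.
yvo4_birefringence = 2.2600 - 2.0241

print(f"n_TE = {n_te:.3f}, n_TM = {n_tm:.3f}, difference = {form_birefringence:.3f}")
print(f"ratio to YVO4 birefringence: {form_birefringence / yvo4_birefringence:.1f}")
```

For periods that are not negligible compared with the wavelength, higher-order EMT or RCWA is needed, which is why both tools are combined in the design work.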

Using a combination of effective medium theory (EMT) and RCWA we designed, fabricated and
characterized a linear grating using a binary phase FBCGH (see Figure 6). The experimentally
measured diffraction efficiencies were 75% for the 0 order, 41.4% for the +1 order and 44.2% for
the -1 order, with extinction ratios of 88:1, 275:1 and 99:1, respectively. The form birefringent
structure also serves as an antireflection coating, which explains why the measured diffraction
efficiencies are slightly higher than the value predicted by scalar diffraction theory for a
binary phase element (40.5%).

5.  Polarization  Rotation  Devices 

The  reconfiguration  of  our  transparent  switching  fabric  is  accomplished  with  electrooptic 
(EO)  polarization  rotation  devices.  The  ‘folded’  configuration  of  our  network  demonstration 
requires  a  2D  array  of  elements,  which  can  currently  be  best  implemented  using  a  commercial 
liquid  crystal  (LC)  device.  However,  our  research  continues  on  development  of  a  new  generation 
of  polarization  rotation  devices  implemented  using  electrooptic  crystals  or  ceramics,  such  as 
lanthanum-modified  lead  zirconate  titanate  (PLZT). 

The  implementation  of  our  transparent  switching  fabric  for  use  in  an  optical  MIN  can  take 
various  forms.  As  an  example,  a  set  of  users  may  wish  to  access  optical  information  stored  on  a 
set  of  optical  disks.  The  MIN  can  be  configured,  allowing  for  routing  conflicts  (i.e.  blocking),  so 
that  information  can  pass  from  the  disks  to  the  users  as  fast  as  the  signal  can  be  generated.  The 
limit  to  the  speed  of  access  comes  from  the  active  components  of  the  system.  At  present  the 
greatest  latency  in  such  a  system  would  come  from  the  access  time  of  the  optical  disks  (on  the 
order  of  milliseconds).  However,  in  the  future  other  optical  memory  devices  may  be  available 
which  will  allow  much  faster  access  times.  In  that  case  the  switching  speed  of  the  polarization 
rotation  devices  within  the  MIN  could  be  the  limiting  factor. 

The ferroelectric LC device from DisplayTech that we are currently using has a response
time on the order of a hundred microseconds. The other important device parameters are
the contrast ratio, transmittance and cross-talk between array pixels. In testing of a 10 x 10 array
designed for infrared operation we found that the cross-talk between 1 mm² pixels was minimal
and the contrast ratio was as high as 100:1. Preliminary testing of a 10 x 10 array designed for
visible light has shown similar characteristics. Response time (as well as contrast ratio) of the
material is a function of the speed and voltage at which the device is run. Operating at 67 Hz,
with ±15 Volts applied, the infrared device had a response time of 135 µs.

In order to decrease the reconfiguration time of the polarization rotation portion of the MIN,
alternative materials will need to be used. We have been investigating the feasibility of using
PLZT, an electrooptic ceramic material, as a replacement for the ferroelectric liquid crystal.
PLZT 9.0/65/35 has a very strong electrooptic effect combined with relatively slim-loop
hysteresis, which can be used to achieve polarization rotation at reasonable switching voltage
(Vπ ≈ 150 Volts), and has a reported response time on the order of 100 nanoseconds.

Our investigation has found that the design of an efficient polarization rotation device based
on PLZT must contend with several critical design parameters. For thin wafers, the transmittance
of the ceramic is very high, almost 100%. However, using thin wafers entails using surface
electrodes, which result in curved electric fields and inhomogeneous phase modulation across an
incident beam’s wave front. Using transverse electrodes (i.e. on the sides of thick wafers) requires
larger electrode separation and thus longer interaction lengths (for similar operating voltages)
that can result in greater scattering (i.e. lower transmittance).
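
The trade-off between electrode separation and interaction length follows from the textbook half-wave voltage of a quadratic (Kerr-type) electrooptic modulator in the transverse geometry. The sketch below is a simplified model: it assumes a uniform field, and the refractive index n = 2.5 and quadratic coefficient R = 3.8e-16 m²/V² are assumed order-of-magnitude values for PLZT, not measurements from this report.

```python
import math


def half_wave_voltage(gap_m, length_m, wavelength_m, n=2.5, kerr_r=3.8e-16):
    """Half-wave voltage of a transverse quadratic electrooptic modulator.

    Induced birefringence: dn = 0.5 * n**3 * R * E**2.
    Retardation:           2*pi * dn * L / wavelength, set equal to pi.
    n and kerr_r are assumed illustrative values.
    """
    field = math.sqrt(wavelength_m / (n ** 3 * kerr_r * length_m))  # V/m
    return field * gap_m


reference = half_wave_voltage(gap_m=1e-3, length_m=1e-3, wavelength_m=488e-9)
wider_gap = half_wave_voltage(gap_m=2e-3, length_m=1e-3, wavelength_m=488e-9)
compensated = half_wave_voltage(gap_m=2e-3, length_m=4e-3, wavelength_m=488e-9)

print(f"1 mm gap, 1 mm path: {reference:.0f} V")
print(f"2 mm gap, 1 mm path: {wider_gap:.0f} V")
print(f"2 mm gap, 4 mm path: {compensated:.0f} V")
```

In this model the voltage scales as gap / sqrt(length), so doubling the electrode separation requires four times the interaction length to keep the same drive voltage, and the longer path is where the additional scattering comes from.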

Both  of  these  device  configurations  must  also  take  into  account  ferroelectric  hysteresis, 
photorefractive  effect,  depolarization  due  to  scattering  and  saturation  of  the  EO  effect  at  high 
electric fields. The hysteresis of PLZT can be reduced by operation at temperatures of about
70-80 °C. Photorefraction can be reduced by alternating the polarity of the driving voltage at
approximately  50  Hz.  However,  both  of  these  measures  reduce  the  magnitude  of  the  EO  response 
of  the  material  and  result  in  a  requirement  for  a  higher  operating  voltage.  High  voltage  operation 
results  in  greater  scattering,  depolarization  and  saturation  effects.  Device  design  to  achieve  the 
most  efficient  device  configuration  will  need  to  find  the  optimal  balance  for  all  of  these  factors. 

Design testing is greatly improved by using computer simulation to look at a multitude of
parameter changes. We are using a finite element analysis program in conjunction with rigorous
material characterization, which has allowed us to simulate arbitrary device configurations (see
Appendix 10 and 11). We have been able to optimize for many individual parameters and are
currently working on modeling that can optimize for multiple parameters simultaneously to achieve
optimal device design for MIN implementation.

For narrow beam polarization rotation, we have found that a transverse electrode geometry
provides the best combination of low voltage operation and uniform phase modulation. Using a
push-pull configuration of MOSFET drivers (see Appendix 12), we have built a PLZT based
polarization rotator capable of switching a square wave at 10 MHz. In Figure 7 we see that the
PLZT rise time (top curve) of 30 ns is even faster than that of the driving signal (bottom curve).
This faster rise time is due to a latency in the EO response of the material at low voltage.
Experimental evidence shows that latency may be a function of frequency, device geometry and
fabrication techniques.


Figure 7. Bottom curve shows the 1 MHz driving signal of 160 V amplitude square wave
with a rise time of approximately 100 ns and a fall time of 80 ns. Top curve shows the
response of the PLZT where the rise time is 30 ns and the fall time is less than 4 ns.

In order to understand the limits of the switching speeds we have also characterized PLZT
response to an AC signal in terms of composition, applied voltage, temperature and device
geometry. Experimental measurements of transverse electrode devices have shown that the EO
effect exponentially decays to about 20% of the DC strength at approximately 10 MHz (see
Figure 8). The two peaks at 3.1 and 6.5 MHz are amplified EO responses due to acousto-optic
resonances. For applications that can take advantage of specific frequency range operation,
device dimensions may be modified to provide significantly reduced operating voltages or
enhanced EO response at prescribed frequencies due to this resonance amplification effect.

Figure 8. EO response of PLZT 8.8/65/35 from DC to 6 MHz with an applied ±75 V
signal amplitude. Peaks at 3.1 and 6.5 MHz are resonance points.

From  DC  to  the  10  MHz  range  the  EO  response  is  primarily  due  to  electrostrictive  effects.  At 
higher  frequencies,  ionic  and  electronic  displacements  are  dominant.  For  applications  that  can 
withstand  low  contrast  ratios  or  high  driving  voltages,  it  is  expected  that  PLZT  ceramic  material 
will  provide  response  times  well  below  1  ns.  At  UCSD  we  continue  investigation  of  the  high 
frequency  EO  response  of  PLZT. 


Figure 9. (a) Transmission intensity through crossed polarizers for single 150 µm wide
element, at Vπ, or the ‘on’ state, of array on 400 µm pitch, where neighboring elements
are also in the ‘on’ state. (b) Transmission intensity when neighboring elements are in
the ‘off’ state, i.e. 0 potential.

Implementation of an array of PLZT polarization rotators must take into account cross-talk
between rotator elements. Cross-talk may be caused by electric field leakage between elements or
by electrostrictive or piezoelectric coupling of material properties. Using finite element analysis
modeling we have investigated electric field distributions for arrays of rotator elements. Figures
9a and 9b show the simulated transmitted intensity for a single element in an array when
neighboring elements are in the ‘on’ state and ‘off’ state, respectively. The intensity is directly
related to the strength of the electric field within the PLZT material. It can be seen that the state
of each element has a strong effect on the electric field distribution of neighboring elements.
Experimental measurements on fabricated devices show similar behavior. Modification of device
design allows for the significant reduction or elimination of measurable cross-talk.

6.  System  Demonstration 
6.1. Folded Multistage MIN

To demonstrate a folded system, we designed and constructed an 8x8 optical MIN based on a
fully connected Banyan architecture [1]. The design process of the system incorporates the
following criteria: (i) maximization of the number of rings in the off-axis Fresnel lenslets, (ii)
a minimum feature size of the diffractive elements that conforms to available fabrication
technologies, and (iii) separation of the diffractive order beams. We developed a Gaussian beam
analysis tool for the stable mode of the cavity that calculates these three parameters for a given
cavity dimension.

The Gaussian beam spot size [19] at the BCGH plane was used as a limiting design
constraint, since at that location the beam size is largest. For a spot size greater than the lenslet
(i.e. array pitch), optical power would leak into adjacent elements, giving rise to cross-talk. A spot
size much smaller than the BCGH lenslet would result in diminished diffraction efficiency. Our
pitch size was chosen to be 1 mm, dictated by the dimensions of the pixelized polarization rotator
device used (ferroelectric liquid crystal (FLC) device from DisplayTech, Model 10x10B). A
beam spot size of 0.825 mm was used, which provides for minimal cross-talk, high diffraction
efficiency and power throughput (97% of the beam energy is contained in the 1 x 1 mm square).
Beginning with a cavity size of 200 mm, where the CGH array was placed in the center of the
cavity (i.e. 100 mm from both mirror planes), the lenslet focal length is 50.4 mm with a 75 µm
waist size at the mirror planes. The 1 mm pitch of the optical elements also dictates that the input
light beam has an incidence angle of 0.5°. The polarization rotator array is placed adjacent to the
CGH array to best match the 1 mm pitch of the FLC elements (see system schematic: Appendix
13).
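
The quoted cavity numbers can be reproduced approximately with elementary Gaussian beam formulas. The sketch below is not the analysis tool described above; it assumes that the quoted spot and waist sizes are 1/e² diameters, that the design wavelength is the 488 nm line used in the experiments, and that the lenslet is a thin lens in a symmetric cavity.

```python
import math

WAVELENGTH_MM = 488e-6          # 488 nm (assumed design wavelength)
WAIST_DIAMETER_MM = 0.075       # 75 um waist at the mirror planes
MIRROR_TO_LENSLET_MM = 100.0    # lenslet array at the center of a 200 mm cavity
SPOT_DIAMETER_MM = 0.825        # design spot size at the lenslet plane
PITCH_MM = 1.0                  # lenslet / FLC pixel pitch


def beam_at_lenslet(waist_diameter_mm, distance_mm, wavelength_mm):
    """Return (1/e^2 beam diameter, wavefront radius of curvature) after propagation."""
    w0 = waist_diameter_mm / 2.0
    rayleigh = math.pi * w0 ** 2 / wavelength_mm
    radius = w0 * math.sqrt(1.0 + (distance_mm / rayleigh) ** 2)
    curvature = distance_mm * (1.0 + (rayleigh / distance_mm) ** 2)
    return 2.0 * radius, curvature


def fraction_in_square(side_mm, spot_diameter_mm):
    """Fraction of Gaussian beam power inside a centered square aperture."""
    w = spot_diameter_mm / 2.0
    return math.erf(side_mm / (math.sqrt(2.0) * w)) ** 2


diameter, curvature = beam_at_lenslet(WAIST_DIAMETER_MM, MIRROR_TO_LENSLET_MM, WAVELENGTH_MM)
# In a symmetric cavity the lenslet must reverse the wavefront curvature: 1/f = 2/R.
focal_length = curvature / 2.0
enclosed = fraction_in_square(PITCH_MM, SPOT_DIAMETER_MM)

print(f"beam diameter at lenslet: {diameter:.3f} mm")
print(f"lenslet focal length:     {focal_length:.1f} mm")
print(f"power inside 1 x 1 mm:    {enclosed:.1%}")
```

Under these assumptions the simple model lands close to the 0.825 mm spot size, 50.4 mm focal length and 97% enclosed energy quoted in the text.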

The CGH array was fabricated in a quartz substrate using eight phase levels, with diffraction
efficiency greater than 90%. Each CGH element functions as an off-axis Fresnel lenslet, whose
deflection angle is dictated by the interconnection pattern. For our 8x8 Banyan network the
largest angle, corresponding to shifting the beam by three elements, is 0.015 radians. The BCGH
element was fabricated using 4-phase level etching in YVO4 crystal, selected for its high degree
of birefringence, using the MOD approach. The BCGH elements were designed to use the 0
order (no phase encoding) for bypass mode and the +1 order (linear phase) for exchange mode.
The diffraction efficiency was experimentally measured at 55.3% for the 0 order and 45.3% for
the +1 order, with extinction ratios of 10:1 and 30:1, respectively. The patterned arrays of
micro-mirrors were etched onto Al film evaporated on glass substrates with an average measured
reflectance of 92%. The circular micro-mirrors have diameters of 150 µm, double the calculated
beam diameter at the mirror planes (small enough to avoid the next diffraction order beam).
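
The quoted deflection angle and its consequence for the feature-size criterion can be checked with a short calculation. The sketch below assumes that the three-element shift accumulates over the full 200 mm cavity length and treats the off-axis lenslet as a pure linear carrier grating, ignoring the lens phase; the scalar sinc² formula for multilevel efficiency is the standard textbook estimate rather than a result from this report.

```python
import math

WAVELENGTH_UM = 0.488       # experimental wavelength (488 nm)
PITCH_MM = 1.0              # element pitch
CAVITY_LENGTH_MM = 200.0    # assumed distance over which the lateral shift accumulates
MAX_SHIFT_ELEMENTS = 3      # largest shift quoted in the text
PHASE_LEVELS = 8            # phase levels of the quartz CGH


def multilevel_efficiency(levels):
    """Scalar first-order efficiency of an N-level staircase grating: sinc^2(1/N)."""
    x = math.pi / levels
    return (math.sin(x) / x) ** 2


deflection_rad = MAX_SHIFT_ELEMENTS * PITCH_MM / CAVITY_LENGTH_MM   # small-angle estimate
carrier_period_um = WAVELENGTH_UM / math.sin(deflection_rad)
min_feature_um = carrier_period_um / PHASE_LEVELS

print(f"deflection angle: {deflection_rad:.3f} rad")
print(f"carrier period:   {carrier_period_um:.1f} um")
print(f"smallest step:    {min_feature_um:.1f} um")
print(f"scalar efficiency, {PHASE_LEVELS} levels: {multilevel_efficiency(PHASE_LEVELS):.1%}")
```

The estimate reproduces the 0.015 radian figure and shows why the largest deflection sets the minimum feature size; the scalar efficiency bound is consistent with the measured value of greater than 90%.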

Experimental testing of this 8x8 MIN system was performed using a 488 nm CW Gaussian
beam generated by an argon-ion laser. Initial testing used one optical input channel modulated
by a NEOS Model N71003 acousto-optic (AO) cell. The polarization state of the beam as it
propagates through the network, which dictates the propagation path, is controlled by the
DisplayTech 2-D array of FLC polarization rotators. Reconfiguration of the FLC elements is
computer controlled with a minimum switching time of 0.2 ms.

For 8x8 interconnectivity the beam reflects through the system for 3 round trips (i.e. three layers
of bypass-exchange switches). By diverting the beam after one or two passes we are able to use
the same experimental system to test the performance of a 2x2 (single DBS switch) or 4x4
network. Figure 10 shows the output from a single DBS reconfiguring a single input, at a 1 kHz
rate, between two output channels, where the input is a 50 kHz square wave signal (the relatively
slow signal was used to allow simultaneous oscilloscope triggering of both frequencies). The top
trace is for the channel 1 signal and the bottom for the channel 2 signal. Both configurations give
a >20:1 signal-to-noise ratio (SNR). Results using a 10 MHz signal show similar SNR,
highlighting the optical transparency of the system. Testing of the complete 8x8 system resulted
in a measured SNR of better than 10:1.
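
The relationship between network size, number of round trips and switch states can be sketched with destination-tag routing. The example below assumes a generic butterfly wiring in which the switch at each stage pairs channels whose addresses differ in one bit; the actual interconnection pattern of the demonstrated network may differ, and the function name is illustrative.

```python
def route(src, dst, n_ports=8):
    """Return (switch states, channel positions) for one path through a butterfly MIN.

    n_ports must be a power of two; the number of stages is log2(n_ports).
    """
    stages = n_ports.bit_length() - 1
    position = src
    states = []
    path = [src]
    for stage in range(stages):
        bit = stages - 1 - stage                  # address bit resolved at this stage
        exchange = ((position ^ dst) >> bit) & 1  # 1 if the bit must be flipped
        if exchange:
            position ^= 1 << bit                  # exchange: move to the paired channel
        states.append("exchange" if exchange else "bypass")
        path.append(position)
    return states, path


states, path = route(src=0, dst=5, n_ports=8)
print(states)   # one state per round trip through the folded cavity
print(path)     # channel occupied before and after each stage
```

With this wiring an 8x8 network needs three stages, a 4x4 network two and a single 2x2 switch one, matching the one, two or three round trips used in the experiment. Two simultaneous paths that require different states of the same switch would conflict, which is the blocking behavior of this class of network.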


Figure 10. Folded DBS output shows 30:1 extinction ratio when switching signal
between two channels. Rise and fall time of reconfiguration is approximately 100 µs.

Table 1. Linear phase BCGH element performance shows the diffraction efficiencies for
several diffraction orders. The zero order contrast ratio is less than 10:1.


The relatively low SNR for this system was primarily due to two factors: (i) Cross-talk from
unwanted diffraction orders from the BCGH (Table 1 summarizes the efficiency per diffraction
order) that were able to propagate within the system. The filtering by the micro-mirror pattern
was ineffective, since the mirror locations matched the higher diffraction orders in the design,
allowing coupling of unwanted signals into other channels. (ii) Low contrast ratio, for the two
polarization states, in the 0 order of the BCGH. In exchange mode, where each switch was
designed to propagate only the +1 diffraction order, the residual 0 order component gives rise to
strong cross-talk within the dilated switch.

6.2. Reduced Cross-talk MIN

Addressing  the  shortfalls  of  the  first  generation  folded  MIN,  a  redesign  was  initiated  to 
decrease  cross-talk.  New  BCGH  elements  were  designed  to  use  the  +1  diffraction  orders  for 
bypass  mode  and  the  -1  orders  for  exchange  mode.  Also,  the  functionality  of  the  lenslet  array  and 
the  BCGH  were  combined  to  form  a  single,  more  complex  optical  element  (see  Figure  11). 


Figure 11. Microscopic photograph of new single BCGH element (part of an 8x6 array)
where two Fresnel lenslets are encoded into a birefringent material so that each lenslet
responds to one of two orthogonal polarization components.

This new design has the following advantages: (i) Only the high contrast ratio ±1 orders are
allowed to propagate within the resonator. The non-diffracted, 0 order light, which has a relatively
large residual component, is filtered out of the system. (ii) The unwanted higher order diffraction
light is dispersed over a large area of the micro-mirror plane and is not focused onto the
micro-mirrors (see Figure 12a). The amount of optical power incident on adjacent micro-mirrors (i.e.
noise) and reflected back into the system is very small. It is determined by the ratio of the area of
the micro-mirror to the area of the diffracted order at the plane of the mirrors (see Figure 12b).
(iii) The number of components inside the resonator has been reduced to three: the polarization
modulator, whose placement tolerance is relatively loose, the polarizer to filter out unwanted
polarization components, and the BCGH element. This reduces losses due to reflection and
further facilitates the compact packaging and simplified alignment of the system (see Figure 13).


(a)  (b) 


Figure 12. (a) Output of new combined BCGH element shows high diffraction orders
dispersed over a large area. (b) Spatial filtering using micro-mirrors reflects only the
wanted +1 or -1 order signals and allows noise to exit the system.

Similar to the previous system design, the second generation design has three basic design
criteria: (i) number of rings in the lenslets, (ii) minimum DOE fabrication feature size and (iii)
separation of diffractive order beams at the mirror planes. Choosing a minimum beam separation
and DOE feature size, and varying the cavity dimensions to find the maximum number of lenslet
rings (for high diffraction efficiency) optimizes the system design. With a minimum feature size
of 10 µm and a maximum of 1 mm diameter lenslets, the new BCGH lenslets have F/# 50.
Accounting for beam propagation through multiple optical elements (BCGH, rotator, and
polarizer), the lenslet focal length was found to be 85.1 mm with a 300 µm waist size at one
mirror plane and 100 µm waist size at the other. The full resonant cavity length is 407 mm with
the BCGH element placed 107 mm from the back side mirror. The circular micro-mirrors have
diameters of 400 µm and 150 µm, slightly larger than the calculated beam diameter at the mirror
planes.


Figure 13. Picture of second generation folded MIN experimental system shows resonant
cavity comprised of two glass substrates with Al micro-mirrors. Optical elements within
the cavity are: (i) polarizer to filter linear cross-talk, (ii) FLC array for reconfiguration of
interconnections and (iii) BCGH element (mounted on a glass slide) with polarization
selective Fresnel lenslet encodings.

Figure 14 shows the output from a single 2x2 DBS reconfiguring between bypass and
exchange mode at a 1 kHz rate. There are two input signals, one operating at DC and the other an
AO modulated signal with a square wave at 20 kHz (this relatively slow input signal was used to
allow simultaneous oscilloscope triggering of both AO and FLC reconfiguration frequencies).
The top trace is for the channel 1 output and the bottom is for the channel 2 output. Both
configurations give extinction ratios (for one input signal, defined as the ratio between the ‘on’
state and ‘off’ state) greater than 59:1 and SNR (for multiple input signals, defined as the ratio of
the signal to the noise at the same output, i.e. cross-talk) of greater than 57:1. Results using
signals ranging from DC to 10 MHz show similar SNR, highlighting the optical transparency of
the system. The extinction ratio for the DBS is significantly better than that of the individual BCGH
elements, which shows how the DBS reduces cross-talk.

Figure 14. Output from DBS fabricated using new BCGH elements gives 60:1 extinction
and cross-talk ratios.
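
The benefit of dilation can be illustrated with an idealized leakage model. The sketch below assumes that unwanted light must pass two switching elements in the wrong state before it reaches an output, so that the leakage of a single element is squared; it ignores stray diffraction orders, polarizer leakage and detector noise, which is why measured ratios fall well short of this bound. The function names are illustrative.

```python
import math


def to_db(ratio):
    """Convert a power ratio (e.g. 30 for 30:1) to decibels."""
    return 10.0 * math.log10(ratio)


def ideal_dilated_extinction(element_extinction):
    """Idealized extinction when leakage must pass two imperfect elements in series."""
    leakage = 1.0 / element_extinction
    return 1.0 / (leakage * leakage)


element = 30.0      # approximate extinction ratio of a single BCGH element (from the text)
measured_dbs = 59.0  # measured DBS extinction ratio (from the text)

ideal = ideal_dilated_extinction(element)
print(f"single element:   {element:.0f}:1  ({to_db(element):.1f} dB)")
print(f"measured DBS:     {measured_dbs:.0f}:1  ({to_db(measured_dbs):.1f} dB)")
print(f"ideal second-order bound: {ideal:.0f}:1  ({to_db(ideal):.1f} dB)")
```

The measured improvement over a single element is consistent with suppression of first-order cross-talk, while the gap to the idealized bound suggests that residual noise sources other than element leakage remain.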

By allowing the beams to make two round trips through the cavity, a 4x4 MIN was
experimentally characterized. Using a single DC input signal switching between four output
channels, we measure an average SNR of 90:1 and extinction ratio of 120:1. To further
investigate cross-talk we introduce a second input signal. The measured output amplitudes are
shown in Figure 15. Most notable is that the output intensities vary depending on the output as
well as the input channel. This may be due to the variation of the Fresnel lenslet diffraction
efficiency for different polarization states. The average minimal SNR (i.e. the ratio of the weakest
output signal to the strongest output noise) is 87:1.

The complete 8x8 interconnection system was characterized using a single DC input signal
switching between all eight output channels. The output plane, where output signals may be
coupled into optical fibers, was imaged onto a CCD camera (see Figure 16). The average
measured SNR was better than 30:1. This relatively low SNR can be attributed to the strong
background noise and small dynamic range of the CCD device used. We performed similar
measurements using two input signals switching between all eight output channels that gave
similar SNR results.

Figure 15. Output packets from 4x4 switch with two input signals. The two input signals
(A and B) are routed to the four output channels by a host computer controller. Varying
packet output intensity is due to a greater BCGH diffraction efficiency for the vertical
polarization state (compared to the horizontal polarization state).

Figure 16. Switching a single DC input between eight output channels shows an average
cross-talk ratio of 30:1.


7.  Conclusions 


The  research  results  of  this  program  can  be  summarized  as  follows: 

(1)  We  have  developed  two  new  approaches  to  design  and  fabrication  of  birefringent 
computer  generated  holograms.  The  MOD  approach,  which  can  be  fabricated  on  single  substrates 
of  birefringent  materials,  was  designed,  fabricated  and  experimentally  tested.  This  design  showed 
excellent  diffraction  efficiency  and  extinction  ratios  for  the  binary  phase  elements.  However,  due 
to  ion  beam  etching  inconsistencies,  the  performance  of  the  four  phase  level  elements  was  well 
below  the  efficiency  predicted  by  scalar  theory.  The  form  birefringent  FBCGH  approach  was 
designed  using  a  combination  of  EMT  and  RCWA.  A  binary  phase  level  device  was  fabricated 
and  experimentally  characterized.  It  has  diffraction  efficiencies  higher  than  scalar  theory  predicts, 
which  indicates  that  these  elements  have  anti-reflection  functionality. 

(2) We have developed the Gated-Hold protocol for centralized control of blocking MIN.
Unlike previous methods, which have been analyzed by deterministic approaches that do not
allow rigorous testing, the gated-hold protocol has been rigorously analyzed using stochastic
methods. Within a cycle, this protocol guarantees the delivery of all requests. The gated-hold
protocol compares favorably to other blocking network protocols. However, the overhead
imposed by the gated-hold protocol is characterized by the hardware complexity. Depending on
the hardware configuration a different method of analysis may be required.

(3)  We  have  designed  and  fabricated  PLZT  9.0/65/35  based  polarization  rotation  devices  that 
can  be  switched  at  faster  than  10  MHz  with  a  160V  square  wave  driving  voltage.  This  range  of 
composition  (8.8-9.65/65/35  PLZT)  has  been  extensively  characterized  for  frequency, 
temperature  and  geometry  dependent  EO  response.  The  use  of  a  novel  transverse  electrode 
geometry  allows  for  coupling  of  acousto-optic  effects  for  EO  amplification  at  design  frequencies. 
We have also developed an FEA based model of PLZT modulators, which successfully predicts
device  behavior  and  is  used  for  device  design  and  optimization. 

(4) In this research we have designed and implemented an optical MIN using a novel folded
optical DBS based on the BCGH polarization selective technology we developed. We have
demonstrated how this design allows for the elimination of the first-order cross-talk, ease of MIN
system alignment and packability as well as filtering of unwanted high diffraction orders. The use
of space variant lenslets also allows for the implementation of arbitrary MIN architectures.
Comparison of the ~30:1 extinction ratio for BCGH elements and the ~60:1 extinction ratio for the
DBS highlights the ability of the dilated switch to reduce cross-talk.

Our initial design of a folded DBS was based on switching between the 0 and +1 diffraction
orders using simple linear phase encoded BCGH elements. However, this design allows the
strong residual component of the 0-order light, with less than 10:1 contrast ratio, as well as high
diffraction orders to propagate in the system, which resulted in an average SNR of 10:1 for the 8x8
system. An improved design switches between the +1 and -1 diffraction orders and combines the
space-variant lenslets and polarization selective elements. This new design results in an
improved average SNR of 30:1 and further reduces the complexity of the system.

The  four-phase  level  BCGH  elements  with  low  extinction  ratios  were  the  limiting  factor  in 
improving  the  SNR  of  the  system.  This  poor  performance  can  be  attributed  to  inaccurate  etching 
depths  due  to  ion-etching  device  inconsistencies.  Significantly  higher  SNR  for  the  folded  system 
can  be  expected  with  improved  etching  facilities  and  the  use  of  alternative  BCGH  design 
approaches.


Polarization-controlled multistage switch based on polarization-selective
computer-generated holograms


Ashok  V.  Krishnamoorthy,  Fang  Xu,  Joseph  E.  Ford,  and  Yeshayahu  Fainman 


We describe a polarization-controlled free-space optical multistage interconnection network based on
polarization-selective computer-generated holograms: optical elements that are capable of imposing
arbitrary, independent phase functions on horizontally and vertically polarized monochromatic light.
We investigate the design of a novel nonblocking space-division photonic switch architecture. The
multistage-switch architecture uses a fan-out stage, a single stage of 2 x 2 switching elements, and a
fan-in stage. The architecture is compatible with several control strategies that use 1 x 2 and 2 x 2
polarization-controlled switches to route the input light beams. One application of the switch is in a
passive optical network in which data is optically transmitted through the switch with a time-of-flight
delay but without optical-to-electrical conversions at each stage. We have built and characterized a
proof-of-principle 4 x 4 free-space switching network using three cascaded stages of arrayed birefringent
computer-generated holographic elements. Data modulated at 20 MHz/channel were transmitted
through the network to demonstrate transparent operation. © 1997 Optical Society of America


1.  Introduction  and  Background 

There  is  a  growing  need  in  the  telecommunications 
and  data-communications  industry  for  a  scalable 
switch  that  can  provide  high-throughput  communi¬ 
cation  between  a  large  number  of  input-output  (I/O) 
ports. Recent advances in the area of fiber amplifiers
have spurred interest in transparent optical networks,
wherein communication between users is
achieved without multiple conversions between the
optical and electrical domains.1 The implementation
of 16 x 16 and larger switches in a number of
optical  technologies  is  currently  being  pursued. 
Moreover,  polarization  compensators  have  been  de¬ 
veloped  for  single-mode  fibers  to  permit  automatic 
and  stable  control  of  the  polarization  state  of  output 
optical signals.2 One might thus envision a switching
system that uses polarization-dependent optical
switches. Because of its low-delay, high-throughput
characteristics, such a switch may also find applications
in a tightly coupled multiprocessor networking
system  or  a  parallel-processor-to-memory  intercon¬ 
nection.  In  fact,  polarization  switching  has  been 
widely  proposed  for  use  in  the  context  of  free-space 
optical  multistage  interconnection  networks^*^  for 
switch  sizes  up  to  256  X  256.® 

In  this  paper  we  describe  a  novel  free-space 
polarization-controlled  optical  switch  design  and 
present  the  implementation  of  a  4  X  4  photonic 
switch.  The  potential  advantages  of  this  design  in¬ 
clude  no  bulky  birefringent  optical  components,  fewer 
optical  surfaces  resulting  in  lower  insertion  loss,  no 
path-dependent  attenuation  nonuniformity,  a  revers¬ 
ible  optical  path,  and  greater  flexibility  in  choosing 
the  optical  interconnect  topology  and  the  resulting 
switch  architecture.  The  switching  system  is  based 
on  a  unique  polarization-selective  optical  element  ca¬ 
pable  of  acting  with  an  arbitrary  independent  phase 
function  on  illumination  with  horizontally  or  verti¬ 
cally  polarized  monochromatic  light.  This  element, 
known  as  a  birefringent  computer-generated  holo¬ 
gram  (BCGH),  is  composed  of  two  birefringent  sub¬ 
strates  etched  with  surface-relief  patterns  and  joined 
face to face.

In Section 2 we review the BCGH technology and
discuss its application to the basic 2 x 2 switch. In
Section 3 we describe a new, nonblocking multistage-
switch architecture that is well suited to a photonic
implementation with BCGH technology. In Section
3 we also present a discussion of system trade-offs
and a comparison of the architectures with other
well-known multistage-switch designs. In Section 4
we present the implementation and characterization
of a 4 x 4 photonic BCGH switch. In Section 5 we
provide a summary and conclusions.

Fig. 1. Principle of operation of a polarization-selective hologram:
The lenses represent two imaging stages, of which the first imaging
stage places one polarization on each Fourier-plane hologram and
the second combines the outputs of the two polarizations. Here,
4-f imaging is required to transfer both the amplitude and phase of
the incident wave front accurately. A BCGH combines the functionality
of polarization beam splitters and associated interconnection
optics in a single planar element.

2.  Application  of  Polarization-Selective 
Computer-Generated-Hologram  Technology 
to  a  2  X  2  Switch 

A.  Birefringent  Computer-Generated-Hologram 
Technology 

Multistage  interconnection  networks  (MIN’s)  using 
polarization  switching  can  be  built  with  polarizing 
beam splitters (PBS's) and other bulk optics. However,
system costs and complexity limit the number of
stages  and  therefore  the  network  size.  It  is  possible 
to  simplify  the  system  substantially  and  eliminate 
many  of  the  optical  alignments  by  replacement  of  the 
discrete  optical  components  with  polarization- 
selective  planar  holograms  (Fig.  1).  A  PBS,  imaging 
lenses  (L),  and  two  computer-generated  holograms 
(CGH’s)  can  be  replaced  by  a  single  polarization- 
selective  CGH  that  has  a  different  phase  profile  for 
each  of  the  two  orthogonal  linear  polarizations  (Fig. 
2).  Polarization-selective  holograms  have  been  fab¬ 
ricated  with  various  techniques,  including  optical  re¬ 
cording  of  dichromated  gelatin,®-^®  organic  media, 
and photorefractive crystals, as well as lithographic
recording  of  polarization  foil.'®  However,  we  are  pri¬ 
marily  interested  in  a  particular  type  of  polarization- 
selective  hologram— the  birefringent  CGH,  that  is,  a 
CGH  fabricated  in  birefringent  media.''*”'” 

Fig. 2. Schematic diagram of the construction of BCGH's by the
placement of two thin holographic elements face to face. At least
one of the holograms is etched in an anisotropic medium (e.g.,
LiNbO3). H, horizontally; V, vertically; pol, polarized.

A conventional kinoform CGH is a two-dimensional
(2-D) phase profile that transforms the input light
(e.g., a plane wave) into the desired output (e.g., an
array of points). The desired continuous phase profile
is first computed and then reduced to a minimum
of data by pixellation, truncation to modulo $2\pi$, and
quantization into discrete values. This data array is
then  used  to  fabricate  the  hologram.  A  BCGH  is 
similar  in  function  to  a  conventional  phase-only  kino¬ 
form  CGH,  except  that  a  BCGH  has  a  different  phase 
profile  for  each  of  the  two  orthogonal  linear  polariza¬ 
tions  that  illuminate  the  hologram.  One  fabricates 
kinoform  phase-only  CGH’s  by  etching  a  surface- 
relief  profile  into  an  isotropic  glass  substrate.  In  a 
BCGH,  the  surface-relief  profile  is  etched  into  a  bire¬ 
fringent  substrate.  The  birefringent  substrate  pro¬ 
vides  different  indices  of  refraction,  depending  on  the 
input  polarization,  that  are  used  to  differentiate  be¬ 
tween  horizontally  and  vertically  polarized  inputs. 
The  information  content  of  the  two  arbitrary  phase 
functions  is  contained  in  two  etched  substrates, 
joined  face  to  face  so  that  both  profiles  effectively  lie 
in  the  same  optical  plane  (see  Fig.  2).  These  two 
substrates  can  apply  an  arbitrary  phase  for  the  two 
orthogonal  linear  polarizations. 
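The reduction of a continuous phase profile to a fabrication data array (pixellation, truncation to modulo $2\pi$, and quantization) can be illustrated with a short Python sketch. It is a minimal illustration only, assuming a one-dimensional blazed grating; the function names and the nearest-level rounding rule are hypothetical choices and are not taken from the fabrication process described here.

```python
import math

def quantize_phase(phase, levels):
    """Wrap a phase (radians) to [0, 2*pi) and snap it to the nearest of `levels` equal steps."""
    step = 2 * math.pi / levels
    wrapped = phase % (2 * math.pi)              # truncation to modulo 2*pi
    index = int(round(wrapped / step)) % levels  # nearest level; a value near 2*pi wraps to level 0
    return index * step

def blazed_grating_profile(pixels_per_period, periods, levels):
    """Sample an ideal linear phase ramp on a pixel grid (pixellation) and quantize it."""
    profile = []
    for i in range(pixels_per_period * periods):
        continuous = 2 * math.pi * i / pixels_per_period  # ideal ramp: 2*pi per grating period
        profile.append(quantize_phase(continuous, levels))
    return profile

# Four pixels per period and four phase levels (assumed example values).
profile = blazed_grating_profile(pixels_per_period=4, periods=2, levels=4)
level_indices = [round(p / (math.pi / 2)) for p in profile]
print(level_indices)
```

With four pixels per period and four levels, each period becomes a four-step staircase, which is the kind of data array that would be handed to the etching process.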

The  operation  of  the  BCGH  can  be  explained  by 
consideration  of  the  case  in  which  one  substrate  is 
birefringent  and  the  other  is  isotropic  and  in  which 
the  polarization  of  the  incident  light  is  aligned  either 
along  or  perpendicular  to  the  optical  axis  of  the  bire¬ 
fringent  substrate.  A  ray  transmitted  through  the 
birefringent  substrate  will  have  a  different  phase  de¬ 
lay  for  each  polarization  because  the  indices  of  refrac¬ 
tion  differ.  At  each  pixel,  the  phase  angle  between 
the  two  polarizations  and  the  absolute  phase  delay  of 
the rays depend on the thickness, hence the etch
depth,  of  the  birefringent  substrate.  This  etch  depth 
is  chosen  to  produce  the  desired  final  phase  angle 
between  the  two  polarizations.  The  ray  then  passes 
through  the  isotropic  substrate,  where  light  of  either 
polarization  is  delayed  by  the  same  phase  angle, 
again  depending  on  the  etch  depth.  This  etch  depth 
is  chosen  to  bring  one  polarization  to  the  desired 
phase angle. Because the relative delay between polarizations
is unaffected by the isotropic substrate,
the phase angles of both polarizations are simultaneously
brought to the final desired values. The result
is an optical element that can have high
diffraction efficiency and provide arbitrary functionality
for each of the two orthogonal linear polarizations
of the input light.
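The two-step choice of etch depths described above can be sketched numerically. The following Python fragment is an illustrative model only: it assumes normal incidence, surface relief against air (index 1), and hypothetical refractive indices and wavelength; the function names are invented for this sketch and it is not the design procedure of the cited references.

```python
import math

TWO_PI = 2 * math.pi

def relief_heights(phi_h, phi_v, n_h, n_v, n_iso, wavelength):
    """Relief heights (same unit as wavelength) for one BCGH pixel.

    phi_h, phi_v: target phases in radians for the two polarizations.
    n_h, n_v: indices of the birefringent substrate seen by each polarization.
    n_iso: index of the isotropic substrate.
    """
    k = TWO_PI / wavelength
    dn = n_v - n_h
    if dn == 0:
        raise ValueError("substrate must be birefringent (n_h != n_v)")
    # Step 1: the birefringent relief sets the phase difference phi_v - phi_h (mod 2*pi).
    diff = phi_v - phi_h
    if dn > 0:
        h_bire = (diff % TWO_PI) / (k * dn)
    else:
        h_bire = ((-diff) % TWO_PI) / (k * -dn)
    # Step 2: the isotropic relief delays both polarizations equally,
    # bringing phi_h (and therefore phi_v) to its target value.
    residual = (phi_h - k * (n_h - 1) * h_bire) % TWO_PI
    h_iso = residual / (k * (n_iso - 1))
    return h_bire, h_iso

def pixel_phases(h_bire, h_iso, n_h, n_v, n_iso, wavelength):
    """Phases (mod 2*pi) accumulated by each polarization through both reliefs."""
    k = TWO_PI / wavelength
    common = k * (n_iso - 1) * h_iso
    return ((k * (n_h - 1) * h_bire + common) % TWO_PI,
            (k * (n_v - 1) * h_bire + common) % TWO_PI)

# Illustrative numbers only (not measured values from the text); lengths in micrometers.
heights = relief_heights(math.pi / 2, math.pi, 2.30, 2.20, 1.50, 0.5)
print(heights, pixel_phases(*heights, 2.30, 2.20, 1.50, 0.5))
```

Because the index difference of the birefringent substrate is small compared with its index contrast against air, the birefringent relief in this model is several wavelengths deep, whereas the isotropic relief stays below about two wavelengths.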

The  BCGH  element  effectively  splits  the  input  light 
beams  by  polarization,  acts  independently  on  each 
beam  by  use  of  separate  CGH's,  and  recombines- 
redirects  the  output  beams  (see  Fig.  1).  The  process 
of  computing,  etching,  and  assembling  a  BCGH  is 
described  in  greater  detail  elsewhere.^^-^®  Methods 
of  introducing  artificial  anisotropy  in  a  transparent 
material  by  use  of  high-spatial-frequency,  surface- 
relief  nanostructures  are  also  being  investigated^®; 
such  investigation  will  ultimately  permit  BCGH  ele¬ 
ments  to  be  fabricated  on  a  single  substrate. 

B.  Use  of  a  Polarization-Selective  Computer-Generated 
Hologram  as  an  Optical  2x2  Switch 

Two types of switch can be used for MIN's: 1 x 2 and
2 x 2 switches. A MIN can be made with O(log2 N)
stages of N, 1 x 2 switches per stage and 2N links
between  stages.^®  Switching  is  achieved  by  the 
choice of the output link each input takes. A switching
MIN can also be constructed with O(log2 N) stages
of N/2, 2 x 2 switches per stage and N links between
stages.  Switching  is  achieved  by  the  choice  of  the 
state  of  each  2x2  switch.  This  latter  type  of  net¬ 
work  is  the  one  pursued  in  this  paper. 

As  shown  in  Figs.  3(a)  and  3(b),  it  is  often  conve¬ 
nient  to  build  a  2  X  2  switch  by  use  of  1  x  2 
switches.i^-21  Figure  3(c)  illustrates  the  allowed 
and  disallowed  states  of  the  switch  when  1x2 
switches  with  passive  combining  are  used  to  generate 
a  2  X  2  switch.  The  disallowed  configurations  of  a 
2x2  switch  correspond  to  both  inputs  accessing  the 
same  output.  For  the  case  of  a  BCGH  2x2  switch, 
this  situation  would  correspond  to  both  polarizations 
having  the  same  deflection  angle  at  the  output.  If 
the  inputs  to  the  2  X  2  switch  have  orthogonal  polar¬ 
izations,  then  the  outputs  will  also  have  orthogonal 
polarizations  when  the  axes  of  the  electro-optic  po¬ 
larization  modulator  are  aligned  so  as  either  to  pass 
both  polarizations  without  change  or  to  rotate  the 
polarizations of both beams by 90°. Hence, the disallowed
configurations can be avoided if one ensures
that the inputs to a 2 x 2 switch have orthogonal
polarizations and if a 0° or 90° polarization-rotating
switch is used.
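The argument that orthogonal inputs and a 0° or 90° rotator rule out the disallowed states can be checked with a small behavioral model. The Python sketch below is an abstraction, not a description of the optics: the function name, the port numbering, and the convention that the output hologram sends horizontal polarization to port 0 and vertical polarization to port 1 are all assumptions made for illustration.

```python
def bcgh_switch(inputs, rotate):
    """Behavioral model of a 2 x 2 polarization switch.

    inputs: list of (signal, polarization) pairs, polarization being "H" or "V".
    rotate: False for 0 degrees (bar state), True for 90 degrees (cross state).
    Returns a dict mapping output port -> signal.
    """
    flip = {"H": "V", "V": "H"}
    port_of = {"H": 0, "V": 1}  # assumed deflection convention of the output hologram
    outputs = {}
    for signal, pol in inputs:
        out_pol = flip[pol] if rotate else pol
        port = port_of[out_pol]
        if port in outputs:
            # Both beams left with the same polarization, hence the same deflection angle.
            raise ValueError("disallowed state: two inputs routed to the same output")
        outputs[port] = signal
    return outputs

print(bcgh_switch([("a", "H"), ("b", "V")], rotate=False))
print(bcgh_switch([("a", "H"), ("b", "V")], rotate=True))
```

In this model a collision can occur only when both inputs carry the same polarization, which is the condition the orthogonality requirement excludes.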

A  2  X  2  polarization  switch  will  thus  require  two 
BCGH  planes  and  a  polarization-rotator  plane,  as 
shown  in  Fig.  4.  The  two  inputs  are  both  directed 
into  the  first  BCGH,  which  combines  and  focuses  the 
two  inputs  into  a  polarization  rotator  (PR).  After 
being  combined,  the  two  modulated  beams  propagate 
in  the  same  direction  through  the  PR;  this  step  is 
essential  to  obtaining  high-contrast  modulation  be¬ 
cause  polarization  rotators  are  strongly  angle  sensi¬ 
tive.  The  PR  sandwiched  between  the  two  BCGH 
elements  controls  the  state  of  the  2x2  switch,  i.e., 
either a crossover or straight-through connection. If
the PR is in the off state, the two beams propagate
straight through, maintaining their original polarizations.
When the PR is turned on, the two beams
exchange polarizations.

Fig. 3. Fabrication method and states of 2 x 2 switches: (a) A 2 x
2 switch either passes the inputs straight through or exchanges the
inputs. (b) A 2 x 2 switch can be implemented by use of two 1 x
2 switches with their respective outputs tied together. (c) Allowed
and disallowed states. Disallowed states must be carefully
avoided. In a BCGH implementation this is ensured by the requirement
that the two inputs have orthogonal polarizations and
by use of a 0° or 90° PR switch.

After  transmission  through  (and  possibly  modula¬ 
tion  by)  the  PR,  the  second  BCGH  element  deflects 
the  beams  into  two  different  directions,  depending  on 
their polarization states. Figure 4 shows the beam
focused through a small aperture, which may or may
not be necessary, depending on the modulator technology.
Note that a BCGH switching element combines
the functionality of a 2 x 2 switch and the
associated holographic interconnection optics before
and after the 2 x 2 switch. A 2 x 2 BCGH switch
also has the same functionality for beams propagating
through the optics in reverse, in principle making
the network path reversible.

Fig. 4. Components of a 2 x 2 switch: two BCGH's and a polarization
modulator. $\eta$ is the switch efficiency, R is the transmittance,
and C is the coupling efficiency associated with clipping
losses, which are incurred when imaging a beam through a modulator
aperture.

C. Insertion Loss Calculation for a 2 x 2 Polarization
Switch

Optical  loss  in  a  BCGH  switch  is  due  to  reflections 
from  the  dielectric  surfaces  and  the  diffraction  effi¬ 
ciency  of  the  holograms.  To  increase  the  diffraction 
efficiency  of  a  BCGH  hologram,  and  thus  reduce  the 
insertion  loss  (attenuation)  of  a  BCGH-based  switch, 
multilevel phase BCGH's are necessary. The theoretical
diffraction efficiency of a CGH increases with
the number of phase levels $\Phi$ used in its construction.22
The efficiency of a BCGH, $\eta_{\mathrm{BCGH}}$, is the product
of the efficiencies of its two component holograms:

$\eta_{\mathrm{BCGH}} = \eta(\Phi_a)\,\eta(\Phi_b).$  (1)

For instance, the use of four phase levels ($\Phi_a = \Phi_b = 4$)
in each of the holographic elements results in an
optimum BCGH diffraction efficiency of approximately
64%. Several four-level phase BCGH elements,
each consisting of a 4 x 4 array of blazed
gratings, were designed and fabricated for application
to the MIN described below in Section 4. The
grating periods were 40 µm, and the smallest feature
size in the hologram was 10 µm. A diffraction efficiency
of 26% and a polarization contrast ratio of
130:1 were measured for the four-level phase element.
These values can be compared with a measured
diffraction efficiency of 12% and a contrast ratio
of approximately 50:1 with a binary phase element
having the same feature size. By tilting the hologram
to compensate for alignment errors, Xu et al.^®
have achieved best-case results for a four-level phase
hologram of a 60% diffraction efficiency and a 160:1
contrast.
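The dependence of the ideal efficiency on the number of phase levels can be estimated with the standard scalar-theory expression for a staircase-quantized kinoform, $[\sin(\pi/\Phi)/(\pi/\Phi)]^2$. The Python sketch below applies this textbook formula as an assumption; it is not necessarily the expression used in Ref. 22, and the function names are hypothetical. For two four-level holograms it gives roughly 66%, in the neighborhood of the approximately 64% quoted above.

```python
import math

def kinoform_efficiency(levels):
    """Scalar-theory first-order efficiency of an ideal kinoform with `levels` equal phase steps."""
    x = math.pi / levels
    return (math.sin(x) / x) ** 2

def bcgh_efficiency(levels_a, levels_b):
    """Product of the efficiencies of the two component holograms."""
    return kinoform_efficiency(levels_a) * kinoform_efficiency(levels_b)

for levels in (2, 4, 8, 16):
    single = kinoform_efficiency(levels)
    pair = bcgh_efficiency(levels, levels)
    print(f"{levels:2d} levels: single hologram {single:.3f}, BCGH {pair:.3f}")
```

The measured efficiencies reported above are well below these ideal values, which is consistent with the fabrication errors discussed in the text.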

Surface-reflection losses at the BCGH substrate, $R_s$,
and the modulator substrate, $R_m$, contribute to the
switch loss. A clipping loss C occurs when the
beams are focused through an aperture at the modulator.
The total switch efficiency is then

$\eta_{\mathrm{switch}} = \eta^{2} R_s^{8} R_m^{2} C.$  (2)

If we assume that each optical surface is antireflection
coated with a single dielectric layer (to permit
the maximum range of input angles) and that 16-level
phase holograms are used, then these constants
can be estimated to be $\eta$ = 98.4%, $R_s$ = $R_m$ = 99%,
and C = 98.6%. The total switch efficiency is
then 86.3%, producing an insertion loss of 10
log10($\eta_{\mathrm{switch}}$) = -0.638 dB. The calculations presented
here represent the switch performance of a
transmission device. Note that BCGH components
may also be used in conjunction with smart-pixel
technology. Depending on the particular device
technology, this combination can introduce other surfaces
(e.g., a common substrate that holds the circuit
and the modulator materials).
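The quoted switch efficiency and insertion loss can be reproduced numerically. The Python sketch below assumes that the switch contains two BCGH elements of efficiency 98.4% each, eight BCGH substrate surfaces, two modulator surfaces, and one clipping aperture; these counts are an assumption chosen because they reproduce the figures in the text, and the function and parameter names are hypothetical.

```python
import math

def switch_efficiency(eta, r_s, r_m, c,
                      bcgh_count=2, bcgh_surfaces=8, modulator_surfaces=2):
    """Product of element efficiencies, surface transmittances, and clipping efficiency.

    The default counts are assumptions (two BCGH's of two substrates each, one modulator).
    """
    return (eta ** bcgh_count) * (r_s ** bcgh_surfaces) * (r_m ** modulator_surfaces) * c

def insertion_loss_db(efficiency):
    """Insertion loss in decibels; negative for efficiencies below 1."""
    return 10 * math.log10(efficiency)

eff = switch_efficiency(eta=0.984, r_s=0.99, r_m=0.99, c=0.986)
loss = insertion_loss_db(eff)
print(f"switch efficiency {eff:.3f}, insertion loss {loss:.3f} dB")
```

Changing the surface counts or the element efficiency in this sketch gives a quick estimate of how additional surfaces, such as a smart-pixel substrate, would affect the loss budget.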

If the polarization rotation were exactly $\pi/2$ and
the BCGH's distinguished completely between the
two polarizations, the cross talk would be zero and
the signal-to-noise ratio (SNR) of a 2 x 2 switch
would be infinite. In practice, one can define the
cross talk coming from one switch to be $\delta_c$ and the
maximum number of switches in one path to be S.
Then the SNR of the entire network is given by

$\mathrm{SNR}_{\mathrm{network}} = \log_{10}(1/\delta_c) - \log_{10} S.$  (3)

Note that $\delta_c$ is a critical factor that will determine the
choice of architecture and maximum network size.
In this paper we consider two examples: $\delta_c$ = 0.1%
and $\delta_c$ = [...]. These two cases are typical of currently
achievable technology for bulk and pixellated
BCGH switching elements.
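The scaling of Eq. (3) with path length can be explored with a few lines of Python. This is a sketch under stated assumptions: the function name is hypothetical, and the optional factor of 10 that converts the logarithm to decibels assumes that the cross talk is expressed as a power ratio.

```python
import math

def network_snr(delta_c, path_switches, in_db=False):
    """Evaluate log10(1/delta_c) - log10(S) for per-switch cross talk delta_c and path length S."""
    if not 0 < delta_c < 1:
        raise ValueError("delta_c must lie strictly between 0 and 1")
    if path_switches < 1:
        raise ValueError("a path contains at least one switch")
    value = math.log10(1 / delta_c) - math.log10(path_switches)
    return 10 * value if in_db else value

# Per-switch cross talk of 0.1% for paths of 1, 4, and 16 switches.
for s in (1, 4, 16):
    print(s, round(network_snr(0.001, s, in_db=True), 2), "dB")
```

Each doubling of the number of switches in a path costs about 3 dB in this model, which is why an architecture with a single switching stage is attractive when the per-switch cross talk is the limiting factor.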

3.  Architecture  of  the  Birefringent 
Computer-Generated  Hologram  Multistage  Switch 

A.  Switch  Architecture 

The Stretch network is a class of self-routing MIN's
that provides a continuous performance-cost trade-off
between two of its extreme forms: the fully connected
space-division switch (or crossbar switch) and
the Banyan multistage network. Stretch networks
utilize  a  destination-tag-based  routing  algorithm; 
that  is,  for  each  input  channel,  the  necessary  I/O 
path  through  the  network  can  be  determined  on  a 
stage-by-stage  basis  solely  from  the  specified  desti¬ 
nation  of  the  input  packet.  Stretch  networks  can  be 
designed  to  achieve  a  low  delay  and  arbitrarily  low 
blocking  probabilities  for  various  traffic  patterns 
without  using  internal  buffers  in  the  switching  fabric. 
A  common  feature  of  all  Stretch  networks  is  that  each 
stage  in  the  multistage  switching  network  uses  a 
simple  perfect-shuffle  interconnection  or  any  of  the 
topologically equivalent connection patterns.^^ In
this  paper  we  are  concerned  with  a  specific  nonblock¬ 
ing  Stretch  network  with  N  I/O  channels  and 
A-shuffle  interconnection  between  stages.  The 
broader  class  of  Stretch  networks  is  described  in  more 
detail  elsewhere.^^ 

An example of the switch architecture for N = 8
channels is presented in Fig. 5(a). In this network,
the fan-out (splitting) stages [Fig. 5(b)] permit partial
contention-free routing of the first (log2 N - 1) bits of
the destination address for each of the N inputs; the
switching stage provides the routing on the last bit of
the destination address, and the fan-in (combining)
stages [Fig. 5(c)] concentrate the outgoing data. The
fan-out and fan-in stages provide contention-free demultiplexing
and multiplexing, respectively, of each
input signal. The fan-out stage is connected to the
switching stage by use of a two-shuffle connection
pattern, and the switching stage is connected to the
fan-in stage by use of a (N/2)-shuffle connection pattern.
Notice that the fan-out for each input is N/2,
which is half the fan-out for a fully connected switch.
A single stage of 2 x 2 switches is used in the center
of the switching fabric. An important consequence
is that the switch is strictly nonblocking, with a
unique constant-length path from each input port to
each output port. This can be verified visually from
Fig. 5(a) and is proved in Ref. 25. The highlighted
connections in Fig. 5(a) show the path from a specific
input to each of the output ports.

Fig. 5. (a) Stretch switch with eight inputs-outputs, a fan-out of
4, and one stage of 2 x 2 switches. The switch is nonblocking, and
the first-order cross talk of the network is equal to the cross talk of
a single switch. All lines represent point-to-point connections.
(b) Fan-out modules may either be passive or active. (c) Fan-in,
similarly to fan-out, may be active, passive with optical fan-in, or
implemented with separate detectors and electrical multiplexing
(Mux).

This  nonblocking  architecture  is  well  suited  for 
parallel  optical  implementation  because  it  uses  a  sin¬ 
gle  stage  of  N^/2,  2x2  switching  elements  and 
because  the  destination-tag-routing  property  of  the 
multistage switch results in a simple path-hunt algorithm
that can readily be accomplished in parallel
if required. The structure of this switch is, in principle,
similar to the extended generalized shuffle network
described in Ref. 20, except that the Stretch
network has an exact multiple of log N logical stages
(including  the  fan-out  modules)  between  the  input 
and  output  ports,  thereby  ensuring  a  self-routing 
structure,  hence  a  simple  routing  algorithm,  that 
may  be  applied  to  each  channel  independently  of  the 
others. 

The implementation of the fan-out and fan-in
stages in an optical Stretch network is critical to the
network's performance. The fan-out stage may be
passive (i.e., optical broadcast) [passive splitting
(PS)], which results in a maximum 2/N power efficiency,
or it can be active [active splitting (AS)], i.e.,
built by use of a tree-based architecture with N/2
additional 1 x 2 switches per fan-out module.26
Similarly, the fan-in stage can be active (built with
2 x 1 switches) [active combining (AC)], passive with
optical fan-in [passive combining (PC)], or can use
N/2 separate receivers per output module with electronic
multiplexing. For the active tree-based fan-out
and fan-in modules, the control lines in a stage
are typically tied together for convenience; hence
each module will require (log2 N - 1) control lines to
control N/2 switches.
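The cost of passive splitting and the control-line count of an active module can be tabulated for a given network size. The Python sketch below is illustrative: it takes the 2/N figure as the best-case power efficiency of a passive fan-out, reads the control-line count as log2 N - 1 (one line per tree level for a fan-out of N/2), and uses hypothetical function names.

```python
import math

def passive_split_loss_db(n_ports):
    """Best-case loss of a passive fan-out to N/2 branches (power efficiency 2/N)."""
    if n_ports < 2:
        raise ValueError("network needs at least two ports")
    return 10 * math.log10(2 / n_ports)

def control_lines_per_module(n_ports):
    """Control lines for a tree-based fan-out of N/2 with one shared line per tree level."""
    if n_ports < 2 or n_ports & (n_ports - 1):
        raise ValueError("n_ports must be a power of two")
    return n_ports.bit_length() - 2  # equals log2(N) - 1

for n in (4, 8, 16, 64):
    print(n, round(passive_split_loss_db(n), 2), "dB,", control_lines_per_module(n), "control lines")
```

For N = 8 the passive fan-out costs about 6 dB per input, whereas an active module avoids this loss at the price of two control lines and the additional 1 x 2 switches.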

When AS is used, each fan-out stage consumes the
first (log2 N - 1) bits of the destination address of the
corresponding inputs, and the 2 x 2 switching stage
consumes the last bit to achieve a unique output
address for each of the N inputs. In this mode, the
network can be self-routing. The nonblocking network
structure ensures that no permutation can result
in network blocking or a disallowed switch state.
If PS and AC are used, then the network controller
must work in reverse by use of sender-tag routing,
in which the fan-in unit is set according to the last
(log2 N - 1) bits of the sender address and the 2 x 2
switching stage is set according to the first bit of the
sender  tag.  In  addition,  the  top  half  of  the  inputs 
must  be  polarized  orthogonally  to  the  lower  half  of 
the  inputs  to  ensure  proper  operation  of  the  2x2 
BCGH  switches.  When  active  fan-out  and  active 
fan-in  modules  are  present,  both  destination-tag  and 
sender-tag  algorithms  are  used.  In  all  cases,  the 
path-hunt  algorithm  and  the  switch-setting  process 
may  be  performed  independently  and  in  parallel  for 
each  of  the  N  channels.  This  property  enables  the 
path hunt to be performed in O(log N) time steps.
Note  that  special  care  must  be  taken  to  ensure  that 
the  inputs  to  a  2  x  2  switch  in  the  switching  stage 
have  orthogonal  polarizations.  When  PS  is  used  to¬ 
gether  with  a  shuffle  interconnection  topology,  then 
one  method  of  ensuring  this  is  to  polarize  the  top  half 
of  the  inputs  orthogonal  to  the  lower  half.  If  AS  is 
used  with  an  optical  shuffle,  then  the  polarizations  of 
subsequent  outputs  alternate,  and  the  polarizations 
of  the  lower  half  of  the  inputs  mirror  the  upper  half. 
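The way a destination tag is divided between the active fan-out module and the 2 x 2 switching stage can be sketched as follows. The Python fragment assumes that N is a power of two, that addresses are log2 N bits long, and that the "first" bits are the most significant ones; the function name is hypothetical, and the mapping from bits to physical branches of a particular shuffle topology is deliberately not modeled.

```python
def split_destination_tag(destination, n_ports):
    """Split a destination address into (fan-out bits, switch bit) as bit strings."""
    if n_ports < 2 or n_ports & (n_ports - 1):
        raise ValueError("n_ports must be a power of two")
    if not 0 <= destination < n_ports:
        raise ValueError("destination out of range")
    bits = n_ports.bit_length() - 1          # log2(N) address bits
    tag = format(destination, f"0{bits}b")   # zero-padded binary address
    return tag[:-1], tag[-1]                 # first log2(N) - 1 bits, then the last bit

# Each channel can be processed independently of the others.
for dest in range(8):
    print(dest, split_destination_tag(dest, 8))
```

Because each tag is split without reference to any other channel, the same computation can be carried out for all N channels in parallel, as noted above for the path-hunt algorithm.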

Fig. 6. Possible control algorithms for BCGH-based MIN's: (a) centralized control with global switching, (b) centralized control with
direct injection, (c) centralized control with packet headers, and (d) distributed control with self-routing headers.


B.  Control  Algorithms 

In  a  BCGH-based  Stretch  network,  the  switch  can 
operate  in  a  transparent  circuit-switched  mode  and 
can  be  either  locally  or  globally  routed.  The  setup 
and  reconfiguration  times  depend  on  the  specific 
polarization-modulator  technology,  but  after  the  net¬ 
work  state  is  set  the  data-transmission  rate  is  limited 
only  by  signal  attenuation,  cross  talk,  and  factors 
external  to  the  multistage  switch,  i.e.,  the  transmit¬ 
ter  and  receiver  responses.  Communication  in¬ 
volves  three  phases:  (i)  circuit  establishment,  in 
which  end-to-end  circuits  are  set  up  before  transmis¬ 
sion  begins;  (ii)  data  transmission,  in  which  the  data- 
modulation  rate  is  decoupled  from  the 
reconfiguration  of  the  network;  and  (iii)  circuit  dis¬ 
connect.  The  switch  described  in  this  paper  is  of  the 
space-division switching type, with the added possibility
that the inputs to the network may be time-division
or wavelength-division multiplexed (e.g.,
from an optical-fiber bundle).

A key concern is the control of the network's
switching states. Several distinct types of control
algorithms can be defined: centralized control with
global switching, centralized control with direct injection,
centralized control with packet headers, and
distributed control with self-routing packet headers.27
Figures 6(a)-6(d) show each approach schematically.
In centralized control with global
switching,  the  switches  in  each  layer  of  the  network 
are  linked  and  can  only  switch  as  a  unit.  The  num¬ 
ber  of  control  lines  is  greatly  reduced,  but  only  a 
single  arbitrary  interconnection  of  one  input  to  one 
output  can  be  made  at  a  time.  This  functionality  is 


useful  in  the  fan-out  (fan-in)  modules  for  which  a 
single  input  (output)  must  be  steered  to  a  select  chan¬ 
nel,  so  that  the  control  lines  of  the  1  X  2  (2  X  1) 
switches  in  a  column  may  be  tied  together.  This 
reduction  in  control  lines  potentially  increases 
second-order  cross  talk  through  the  network. 

The  second  method  is  to  use  centralized  control 
with  direct  injection.  In  this  scheme  the  routing  al¬ 
gorithm  is  calculated  at  a  central  controller  that  de¬ 
termines  switch  settings  and  accesses  the  switching 
elements  sequentially  or  in  a  semiparallel  fashion. 
For  controlling  large  networks  it  becomes  essential  to 
have  an  architecture,  such  as  the  Stretch  network, 
that  allows  path  hunts  to  be  performed  in  parallel. 

Another  approach  is  to  use  centralized  control  with 
packet  headers.  The  routing  algorithm  is  performed 
at  a  centralized  controller,  but  the  process  of  setting 
the  individual  switches  of  the  MIN  is  implemented  by 
use  of  packet  headers  that  propagate  through  the 
network. This approach requires approximately
30 transistors per switch and can be implemented by
use of a BCGH coupled with smart-pixel technology.
The  routing  information  must  still  be  distributed  to 
the  first-stage  array,  but  not  to  the  arrays  at  each 
layer  of  the  network.  For  BCGH-based  networks 
using  this  form  of  control,  virtual  circuit  switching  is 
achieved  by  the  specification  of  a  dedicated  time  in¬ 
terval  for  the  packet  headers  with  control  informa¬ 
tion  to  propagate  through  the  network  and  set  up  the 
required  data  paths.  As  soon  as  the  switches  have 
been set, passive transmission (no detection and rebroadcast)
at high data rates is possible. For transparent
operation, control may optionally be at a
distinct wavelength or on a separate path. The
main advantage of this method is a reduced controller
pin-out (a factor of at least log2 N fewer than direct
injection). However, this approach requires synchronous
network operation and a technology that
can provide some intelligence at each pixel location.

A  fourth  approach  is  to  use  distributed  control  with 
self-routing  packet  headers.  Here,  no  central  con¬ 
troller  is  needed.  This  algorithm  is  typically  associ¬ 
ated  with  fast  packet  switching.  A  smart-pixel 
implementation  will  require  50-75  transistors  per 
1 x 2 switch.21 This approach uses the same smart-
pixel  hardware  and  interconnection  topology  as  the 
centralized  approach  with  packet  headers.  These 
packet-header  control  approaches  are  relevant  for 
only  BCGH  networks  that  use  active  fan-out  or  active 
fan-in  (or  both). 

C.  Comparison  of  Nonblocking  Networks 
Two  key  performance  metrics  for  a  passive  multi¬ 
stage  photonic  switch  architecture  are  its  optical  at¬ 
tenuation  and  its  SNR  or  cross  talk.  These  in  turn 
are  related  to  the  amount  of  optical  fan-out  per  input, 
the  number  of  stages  in  the  multistage-switching  net¬ 
work,  and  the  number  of  switches  an  input  signal 
must  traverse  to  reach  its  intended  output  (path 
length).  In  the  following  we  analyze  the  perfor¬ 
mance  of  the  nonblocking  Stretch  network  in  terms  of 
its  cross  talk  and  attenuation  relative  to  some  well- 
known  nonblocking  MIN  architectures  that  are  suit¬ 
able  for  implementation  with  the  2x2  BCGH 
switches. These include the crossbar, N-stage planar,
Benes,29 dilated Benes,30 and Batcher-Banyan
networks.31 A more detailed discussion of these
switching-network  architectures  can  be  found  in 
Refs.  28-32. 

In  terms  of  nonblocking  architectures  for  photonic 
switching,  the  most  well-known  switch  architecture 
is the crossbar. The crossbar (or full space-division
switch) is a strictly nonblocking architecture with
$N^2$ switches and a worst-case path length of 2N - 1. In
a  crossbar,  the  path  length  and  the  signal  skew  de¬ 
pend  on  the  specific  interconnection  path.  The 
N-stage  planar  network  is  a  rearrangeable  nonblock¬ 
ing  architecture  requiring  N  stages  with  N/2  switch¬ 
ing  elements  per  stage.  It  evolved  from  the 
crossbar,  providing  fewer  switches  and  a  planar  (no 
crossover)  architecture  by  use  of  only  the  nearest- 
neighbor  interconnection  and  equal  path  lengths. 
The total number of switches is N(N - 1)/2, and the
maximum  path  length  is  N. 

In terms of nonblocking multistage architectures that require significantly fewer 2 x 2 switches, a widely studied architecture is the Benes network. The rearrangeable nonblocking Benes network consists of two log2 N networks placed end to end. The network has 2 log2 N - 1 stages, which is the theoretical minimum number of stages required for rearrangeable nonblocking operation. This network provides an equal path length, low latency, and low switch count at the expense of a more complicated routing overhead.


The dilated Benes architecture was a modification of the Benes network designed to remove effects of cross talk that plague architectures with a log N or greater depth. This is a Benes network that has been doubled in width while the initial number of inputs and outputs has been maintained. Dilated Benes networks thus have 2 log2 N stages and N switches per stage. The network has the unique advantage that no switching element carries more than one active signal. Hence, first-order cross talk is eliminated. Optical cross talk from another channel can be mixed with an active signal only by its passing through two nominally off switches. When this second-order cross talk is low, the network can achieve a large SNR. Finally, the Batcher-Banyan network is a self-routing network consisting of a sorting network followed by a routing network. The total number of stages is equal to (1/2)(log2 N)^2 + (3/2)log2 N.
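The stage and switch counts quoted in the preceding paragraphs can be tabulated with a short sketch. This is only an illustration under stated assumptions: the function name `depth_and_switches` is hypothetical, N is taken to be a power of two, the crossbar entry reports its worst-case path length rather than a stage count, and the Batcher-Banyan switch count assumes N/2 switches in every stage.

```python
def depth_and_switches(arch, n):
    """Return (depth, number of 2 x 2 switches) for an N x N network.

    depth is the number of stages, except for the crossbar, where it is
    the worst-case path length 2N - 1 quoted in the text.
    """
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two >= 2")
    k = n.bit_length() - 1                      # k = log2(n)
    if arch == "crossbar":
        return 2 * n - 1, n * n
    if arch == "n-stage planar":
        return n, n * (n - 1) // 2
    if arch == "benes":
        return 2 * k - 1, (n // 2) * (2 * k - 1)
    if arch == "dilated benes":
        return 2 * k, 2 * n * k                 # 2 log2 N stages, N switches each
    if arch == "batcher-banyan":
        stages = (k * k + 3 * k) // 2           # (1/2)k^2 + (3/2)k
        return stages, (n // 2) * stages        # assumption: N/2 switches per stage
    raise ValueError(f"unknown architecture: {arch}")


if __name__ == "__main__":
    for name in ("crossbar", "n-stage planar", "benes",
                 "dilated benes", "batcher-banyan"):
        print(name, depth_and_switches(name, 16))
```

For N = 16 the sketch shows how quickly the crossbar's switch count outgrows the Benes-type networks, which is the motivation for the multistage designs discussed here.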

Table 1 contains a summary of the attenuation, SNR, number of stages, and total number of switches for each of the nonblocking architectures described above versus the network size N. Table 2 similarly shows the scaling of the SNR, attenuation, and switch count of the various configurations of the Stretch network. The results depend strongly on the type of fan-out and fan-in modules used. For instance, the number of stages in a Stretch network depends on the design of the fan-out (splitting) and fan-in (combining) stages. It is log2 N if either the splitting or the combining is active (AS/PC or PS/AC), 2 log2 N - 1 if both the splitting and combining stages are active (AS/AC), or 1 if no active switching is used (PS/PC with separate receivers).

As a result of SNR degradation, optical fan-in to a common detector is feasible only for smaller networks. The first-order cross-talk isolation of the AS/AC Stretch network is equal to the cross-talk isolation of a single switch, independent of network size. The second-order cross talk is much smaller in magnitude than the first-order cross talk and can be neglected. The attenuation and the SNR for β_s = 30 dB and β_s = 20 dB are graphed in Figs. 7, 8, and 9, respectively. The corresponding performance of several Stretch networks is shown for comparison. The dotted cutoff lines show the maximum achievable sizes of each architecture, assuming a maximum acceptable attenuation of 30 dB (99.9%) and a minimum SNR of 11 dB (corresponding to a bit error rate of 10^-9). It should be noted that the SNR can be increased at the price of increased attenuation. For example, if the fan-out in the Stretch network were increased from N/2 to N, there would be no first-order cross talk and the SNR (in decibels) could be doubled. In this case second-order cross talk must be accounted for. The resulting network would be identical to a full space-division switch,28 and it would have increased attenuation and would also require more switching elements (Table 2).

It is evident that nonblocking networks, such as crossbars or N-stage planar networks, are not well suited to large-scale implementation with a BCGH.
Table 1. Performance Scaling for Several Well-Known Photonic Switch Architectures in Terms of the Network Size N

Architecture(a) (N inputs/N outputs) | Attenuation(b) (dB) | SNR(c) (dB) | Number of Logical Switching Stages | Number of 2 x 2 Switches(d)
----------------------------------------------------------------------------------------------------
Crossbar (nonblocking) | (2N - 1)α_s | β_s - 10 log10(N - 1) | | N^2
N-stage planar (nonblocking) | Nα_s | β_s - 10 log10 N | N | N(N - 1)/2
Batcher-Banyan (nonblocking) | [(1/2)log2 N (log2 N + 1) + log2 N]α_s | β_s - 10 log10(number of stages) | (1/2)log2 N (log2 N + 1) + log2 N | (N/2)(number of stages)
Benes (rearrangeable, nonblocking) | (2 log2 N - 1)α_s | β_s - 10 log10(2 log2 N - 1) | 2 log2 N - 1 | (N/2)(2 log2 N - 1)
Dilated Benes (rearrangeable, nonblocking)(e) | 2(log2 N)α_s | 2β_s - 10 log10(2 log2 N - 1) | 2 log2 N | 2N log2 N

(a) The architecture was circuit switched.
(b) For the worst-case optical path loss; α_s is the insertion loss per switch, in decibels.
(c) For the worst-case SNR; β_s is the cross-talk isolation per switch, in decibels.
(d) In the entire network.
(e) The SNR in this case is due to second-order cross talk.

In terms of attenuation limits, the Batcher-Banyan MIN scales up to 256 I/O ports. The Benes, dilated Benes, and Stretch networks with AS scale well beyond N = 1000, making them good choices (Fig. 7). Power losses that are due to fan-out limit Stretch networks with PS to fewer than 512 I/O ports. When the cross-talk isolation of a switch equals 30 dB, all these networks perform well in terms of SNR. However, when β_s is lowered to 20 dB, the cross talk from the switches along the routing paths severely limits the scalability of Benes networks and, to a lesser extent, the Stretch networks with PS. In this case, either a dilated Benes network or a Stretch network with AS must be used to counter the deleterious effects of cross talk (Figs. 8 and 9).

The conclusion is that the nonblocking Stretch network with AC is a suitable candidate for a BCGH-based switch implementation and has good potential for extension to large networks. The main advantages of the Stretch network over other suitable multistage networks, such as the dilated Benes, are its nonblocking operation without the need for rearrangement and its simple, parallel path-hunt capability. Among the nonblocking networks, it has the


Table 2. Performance Scaling for Various Configurations of the Stretch Network versus the Network Size N(a)

Architecture(b) (N inputs/N outputs) | Attenuation(c) (dB) | SNR(d) (dB) | Number of Logical Switching Stages | Number of Switches(e) and Type
----------------------------------------------------------------------------------------------------
Stretch AS/AC, tied control lines | (2 log2 N - 1)α_s | β_s | 2 log2 N - 1 | N(N - 2) of 1 x 2 + (N^2/4) of 2 x 2
Stretch AS/AC, separate control lines | (2 log2 N - 1)α_s | β_s | 2 log2 N - 1 | N(N - 2) of 1 x 2 + (N^2/4) of 2 x 2
Stretch PS/AC, tied control lines | (log2 N)α_s + 3(log2 N - 1) | β_s - 10 log10(log2 N - 1) | log2 N | N(N/2 - 1) of 1 x 2 + (N^2/4) of 2 x 2
Stretch AS/PC, tied control lines; fan-in to receivers | (log2 N)α_s | β_s - 10 log10(N/2) | log2 N | N(N/2 - 1) of 1 x 2 + (N^2/4) of 2 x 2
Stretch PS/PC, separate receivers | α_s + 3(log2 N - 1) | β_s | 1 | (N^2/4) of 2 x 2
Full space-division switch AS/AC, fan-out equals N | (2 log2 N)α_s | 2β_s - 10 log10(log2 N) (second-order cross talk) | 2 log2 N | 2N(N - 1) of 1 x 2

(a) Either AS or PS may be used in the fan-out module. For small networks, the fan-in modules may use PC with either optical fan-in or separate receivers. For large networks, the fan-in modules should use AC. The control signals in a stage of an active splitting (fan-out) or an active combining (fan-in) module can be tied together to reduce the number of separate control lines.
(b) The architecture is circuit switched.
(c) For the worst-case optical path loss; α_s is the insertion loss per switch, in decibels.
(d) For the worst-case SNR; β_s is the cross-talk isolation per switch, in decibels. SNR is limited by first-order cross talk unless otherwise noted.
(e) Number in the entire network.
Fig. 7. Network attenuation (in decibels) versus the number of input ports N for BCGH-based MIN’s with the assumption of a 2 x 2 switch insertion loss of α_s = 0.64 dB. X-Bar, crossbar.


fewest number of optical stages. This reduces the network delay and control complexity. The penalties are its increased attenuation when PS is used and its larger number of switches when AS is used. For switching networks with fewer than 512 I/O ports, PS can be used; for larger networks, AS may become necessary. Technological limits to the BCGH Stretch network with AS arise primarily from array-size limits of the pixellated polarization modulator (N^2/4 pixels needed) and the maximum size of the birefringent hologram.
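The trade-off between passive and active fan-out described above can be made concrete by evaluating the Table 2 expressions. The following is a minimal sketch assuming the table entries as read here, a 3 dB loss per passive 1:2 split, and a power-of-two N of at least 4; the function name `stretch_metrics` and the dictionary keys are hypothetical.

```python
import math


def stretch_metrics(config, n, alpha_s, beta_s):
    """Evaluate the Table 2 scaling expressions for one Stretch configuration.

    config is one of "AS/AC", "PS/AC", "AS/PC", "PS/PC".
    alpha_s and beta_s are the per-switch insertion loss and cross-talk
    isolation, both in dB.
    """
    if n < 4 or n & (n - 1):
        raise ValueError("n must be a power of two >= 4")
    k = n.bit_length() - 1           # log2(n)
    core = n * n // 4                # 2 x 2 switches in the single central stage
    trees = n * (n // 2 - 1)         # 1 x 2 switches in N active trees of fan-out N/2
    if config == "AS/AC":
        return {"attenuation_db": (2 * k - 1) * alpha_s, "snr_db": beta_s,
                "stages": 2 * k - 1, "switches_1x2": 2 * trees,
                "switches_2x2": core}
    if config == "PS/AC":
        return {"attenuation_db": k * alpha_s + 3 * (k - 1),
                "snr_db": beta_s - 10 * math.log10(k - 1),
                "stages": k, "switches_1x2": trees, "switches_2x2": core}
    if config == "AS/PC":
        return {"attenuation_db": k * alpha_s,
                "snr_db": beta_s - 10 * math.log10(n / 2),
                "stages": k, "switches_1x2": trees, "switches_2x2": core}
    if config == "PS/PC":
        return {"attenuation_db": alpha_s + 3 * (k - 1), "snr_db": beta_s,
                "stages": 1, "switches_1x2": 0, "switches_2x2": core}
    raise ValueError(f"unknown configuration: {config}")


if __name__ == "__main__":
    for cfg in ("AS/AC", "PS/AC", "AS/PC", "PS/PC"):
        print(cfg, stretch_metrics(cfg, 64, alpha_s=0.64, beta_s=20.0))
```

Comparing the PS and AS rows for the same N shows the pattern stated in the text: passive splitting saves 1 x 2 switches but pays roughly 3 dB per doubling of the fan-out.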

4.  4x4  Switch  Demonstration 

BCGH  components  have  previously  been  evaluated 
for  a  1  X  2  switch^®  and  a  2  X  2  switch.^®-®®  Here  we 
describe the first multistage switching network demonstration based on cascaded arrays of polarization-selective holographic components. We have designed and implemented a three-stage, 4 x 4 BCGH optical multistage switch that can be scaled to larger sizes. The 4 x 4 BCGH-Stretch network used centralized control with global switching for the fan-out module and centralized control with direct injection for the 2 x 2 switching stage. Figure 10


Fig. 8. Network SNR β (in decibels) versus the number of input ports N for BCGH-based MIN’s with the assumption of a 2 x 2 switch SNR of 30 dB. X-Bar, crossbar.

Fig. 9. Network SNR β (in decibels) versus the number of input ports N for BCGH-based MIN’s with the assumption of a 2 x 2 switch SNR of 20 dB. X-Bar, crossbar.


shows a schematic diagram of the network architecture, and Fig. 11 shows a schematic diagram of the experimental setup. The switch consists of three cascaded BCGH switch arrays, two PR’s, and four photodetectors. The fan-out (splitting) stage of the network is an array of 1 x 2 switches, and it consists of BCGH1 and a PR. The second stage of the network is an array of 2 x 2 switches. It is constructed by use of BCGH2 and BCGH3 together with a PR. PC of the beams occurs on four photodetectors (one for each output channel) that serve as the output stage of the network. In general, passive optical fan-in is feasible only for smaller networks (see Table 2). The SNR can be increased by use of a stage of active fan-in modules. Note that the 4 x 4 implementation used a butterfly-type interconnect instead of the shuffle. This permitted smaller deflection angles and allowed a single type of element to be used when the network was folded into two dimensions.

The three BCGH arrays were identical four-level phase, polarization-selective diffractive elements, as shown schematically in Fig. 12. The BCGH switching elements were fabricated by use of standard microfabrication technologies: electron-beam lithography was used to define the mask patterns; optical lithography was employed to transfer the pattern onto the Y-cut lithium niobate substrate; the surface-relief profile was obtained through the use of ion-beam etching. Each BCGH array consisted of a 4 x 4 array of pixels, where each pixel corresponds to a 1 x 2 switch. The dimensions of each pixel were approximately 4 mm x 4 mm, so the overall element had an active area of 1.6 cm x 1.6 cm (Fig. 13).

Each of the 16 switches in an array was a diffractive polarization beam splitter designed to propagate vertically polarized light (solid lines in Fig. 11) straight and to deflect horizontally polarized light (dashed lines) at an angle. The grating period was 40 μm, and the operating wavelength was 514.5 nm. These parameters set the optimum distance between the two BCGH arrays to be approximately 320 mm. Figure 14 shows a photograph of the system that
Fig. 10. Schematic diagram of a 4 x 4 BCGH Stretch switch.

folded  version  of  the  optical  2-shuffle  interconnection.  Scope,  oscilloscope,  V-pol.,  vertically  poianz 
displays the cascaded BCGH arrays. A collimated laser beam was used in the experiment, without any relay imaging optics between the BCGH holograms. The measured diffraction efficiencies of the BCGH holograms were approximately 26%, with peak polarization contrast ratios of 130:1.

Fig. 12. Schematic cross section of the BCGH used for the MIN studied here. The BCGH was fabricated in LiNbO3. The substrate thickness was 1 mm, and the operating wavelength was 514.5 nm.
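The array spacing quoted earlier can be checked to order of magnitude from the grating period and wavelength. The sketch below is an illustration under assumptions not stated in the text: the deflected beam is taken to be the first diffraction order at normal incidence (sin θ = λ/Λ), and the required lateral shift between arrays is taken to be one pixel pitch (4 mm). The function names are hypothetical.

```python
import math


def first_order_angle(wavelength_m, period_m):
    """First-order diffraction angle (radians) at normal incidence."""
    return math.asin(wavelength_m / period_m)


def spacing_for_shift(shift_m, wavelength_m, period_m):
    """Propagation distance at which the deflected beam has moved by shift_m."""
    return shift_m / math.tan(first_order_angle(wavelength_m, period_m))


if __name__ == "__main__":
    wavelength = 514.5e-9     # operating wavelength from the text
    period = 40e-6            # grating period from the text
    pixel_pitch = 4e-3        # assumption: deflect by exactly one 4 mm pixel
    theta = first_order_angle(wavelength, period)
    print(f"deflection angle: {math.degrees(theta):.3f} deg")
    print(f"array spacing: {spacing_for_shift(pixel_pitch, wavelength, period) * 1e3:.0f} mm")
```

Under these assumptions the spacing comes out near 0.31 m, the same order as the approximately 320 mm reported; the small difference may reflect geometric details that this simple model does not capture.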

Two broadband manual PR’s and two electrically controlled liquid-crystal polarization rotators (LCPR’s) were used to characterize the switching network and to demonstrate the reconfiguration of the network, respectively. The contrast ratios of the manual PR’s (Newport, Model PR-550) and the Hughes LCPR’s were 1000:1 (30 dB) and 4:1 (6 dB), respectively. A beam from an Ar+ laser was split into two paths, and two mechanical beam choppers modulated at 300 and 900 Hz were used to label the two input beams. To characterize the performance of the network, the SNR was measured at each output of the network by the intensity ratio between the ON state (one of the two input signals is routed to this output node) and the OFF state (both inputs are routed to other output nodes). The worst-case SNR at the output was measured to be 10:1 (10 dB) when both PR’s were manual; the best-case SNR was 20:1 (13 dB). During normal
Fig. 13. Photograph of the 4 x 4 BCGH element array. The
dimensions  of  the  array  are  16  mm  x  16  mm. 


network operation, LCPR’s were used to provide electrically controlled reconfiguration. The 4 x 4 switch was tested with one and two active inputs. Figure 15 demonstrates one active input to the 4 x 4 switch being switched (reconfigured) between the four outputs. Figure 16 shows the output data with each input beam being switched between two corresponding outputs of the network. In both cases, the worst-case SNR was approximately 4:1, limited by the contrast ratio of the LCPR’s.

The reconfiguration speed of the network was also limited by the temporal response of the PR’s. The LCPR was operated at a maximum reconfiguration rate of approximately 1 kHz. Using PLZT (lead lanthanum zirconate titanate) or multiple-quantum-well PR’s may make reconfiguration as fast as 10-100 MHz possible. Once these PR’s were set at a specific configuration, the data rate was limited by the frequency responses of the transmitters and receivers, because no signal regeneration was used inside the multistage switch. To demonstrate this, we used an acousto-optic modulator modulated at 20 MHz by a pseudorandom non-return-to-zero data generator to modulate one of the input signals. The eye diagram obtained at one of the outputs of the three-stage multistage interconnection network is shown in Fig. 17. Table 3 lists the performance values required of a 2 x 2 BCGH switch for large switches (N ≥ 1024) and the best experimental results obtained to date. Note that the increased cross talk of approximately 3 dB for the experimental 4 x 4 Stretch switch versus that of the 2 x 2 switch is consistent with the predicted values from Table 2 (AS/PC).
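The decibel figures quoted in this section follow from the usual power-ratio conversion, and the 3 dB penalty follows from the AS/PC entry of Table 2 evaluated at N = 4. A small sketch (the helper names are hypothetical, and the 10 log10(N/2) penalty is taken from Table 2 as read here):

```python
import math


def ratio_to_db(ratio):
    """Convert an optical power (intensity) ratio to decibels."""
    return 10 * math.log10(ratio)


def as_pc_crosstalk_penalty_db(n):
    """SNR reduction relative to a single switch for Stretch AS/PC: 10 log10(N/2)."""
    return 10 * math.log10(n / 2)


if __name__ == "__main__":
    for label, ratio in (("manual PR", 1000), ("LCPR", 4),
                         ("worst-case SNR", 10), ("best-case SNR", 20)):
        print(f"{label}: {ratio}:1 = {ratio_to_db(ratio):.1f} dB")
    print(f"AS/PC penalty for N = 4: {as_pc_crosstalk_penalty_db(4):.2f} dB")
```

The N = 4 penalty of about 3 dB matches the increase in cross talk noted between the 2 x 2 switch and the 4 x 4 Stretch switch.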


5.  Summary 

This paper describes the design and implementation of a nonblocking space-division three-dimensional photonic multistage network architecture that uses 2 x 2 BCGH polarization-selective switches to switch and route the light at each node. The switch architecture uses a fan-out stage, a single stage of 2 x 2 switches, and a fan-in stage. This architecture is well suited for parallel optical implementation in that (a) it is nonblocking; (b) it enables fast, parallel path hunting with low-latency communication; (c) it uses simple 2-shuffle and N/2-shuffle connection patterns (or their equivalents); (d) it uses one stage of N^2/4 2 x 2 switching elements; and (e) it reduces the effects of first- and second-order cross talk. The fan-out stage may be passive (i.e., simple optical broadcasting), which results in a 2/N power efficiency, or can incorporate N fan-out modules (one per input port), where each fan-out module uses a tree-based architecture with N - 1 switches of the 1 x 2 type. Similarly, the fan-in stage can either be active or can use N/2 separate detectors per fan-in module. The resulting photonic switch is circuit switched and can be either locally or externally controlled. The control choice will drive the required light-modulator technology and required pixel complexity. Network setup and reconfiguration times depend on the specific polarization-modulator technology, but after the polarization switches are set, the switching network is transparent, and the data-transmission rate is limited by the source and receiver response.

Fig. 14. Photograph of three cascaded BCGH arrays that formed the core of the demonstration free-space switch. The manual PR was used to characterize the network and was replaced by an electrically controlled LCPR for fast reconfiguration.

Fig. 15. Four-trace oscilloscope photographs showing the outputs of the 4 x 4 switch. An input to the 4 x 4 switch is being switched (reconfigured) between (a) outputs 1 and 2, (b) outputs 2 and 3, (c) outputs 3 and 4, and (d) outputs 1 and 4. Network cross talk was limited by the 4:1 contrast ratio of the polarization rotator. A SNR of 13 dB was measured with manual PR’s. The horizontal sweep rate is 10 ms/division.

Fig. 16. Four-trace oscilloscope photographs showing two active inputs to the 4 x 4 switch being simultaneously switched between two outputs of the network. The horizontal sweep rate is 10 ms/division.

A small-scale network was demonstrated in an experimental 4 x 4 BCGH switch. The use of a high-performance pixellated polarization-modulator array, together with ongoing research on improving the performance of the BCGH elements, could make such a switch (with N > 32 ports) a useful candidate for high-speed optical networks, as well as for large-scale optical multiprocessor interconnection networks.

F.  Xu  and  Y.  Fainman  acknowledge  partial  support 
from  Rome  Laboratories  and  the  National  Science 
Foundation. 
Abstract

An accurate comparison between blocking and non-blocking interconnection networks requires an evaluation of the additional overhead imposed on the system by the resubmission of blocked requests. By examining a dominant system which bounds the performance of the actual system, we formulate and solve a queuing model for a realistic processor/memory interconnection system using the gated-hold protocol. Additionally, because the solution of this model requires moderate computing power, an approximate solution that agrees closely with the exact analysis and that incorporates non-uniform input and output distributions is described. The approximate solution is validated with simulations for the case of uniform input and output distributions. Results indicate that delay increases approximately as √N instead of log(N), making this protocol ill suited to large networks.

1  Introduction 

Multistage Interconnection Networks (MINs) have been utilized in Parallel Processing Systems to facilitate connection between processors and among processors and memories. A survey of a number of MIN architectures may be found in [6]. Of particular interest are blocking and non-blocking networks. Of the two, non-blocking networks, although more costly to build, offer full connectivity between any free input and output regardless of the traffic pattern. On the other hand, blocking networks are cheaper to build, but, depending on the traffic pattern, messages may be blocked due to contention within the network. It is natural to enhance the functionality of blocking networks to ensure message delivery before comparing them with non-blocking architectures.

One possible solution is to resubmit blocked messages until eventual delivery. Because requests may have to be submitted several times, each request will experience a random delay depending on the prevailing traffic pattern. The precise overhead incurred by this process of resubmissions is not easily determined because of the stochastic nature of the blocking. Dietrich and Rao [11] [12] analyzed a synchronous, circuit-switched square Banyan network of 2 x 2 crossbar switches implementing a gated-hold protocol in which partial path information is retained to speed circuit set-up. In this protocol, once a set of requests enters the network, the network is closed (gated) and no new requests enter until all current requests have been serviced. In [11], the authors derived the mean time to fully service a batch of uniform and independent requests appearing at the input to the network. Subsequently, in [12], the authors extended this analysis to incorporate arbitrary independent input distributions and ‘hotspot’ output distributions.

In a realistic processor-memory interconnect, the number of requests present at the beginning of a cycle is likely to be proportional to the length of the previous cycle. To further extend our understanding of a processor-memory interconnect system, it is imperative that we understand the temporal evolution of the request generation process and its effect on the response of the system. This extension, which has not been studied before, is the subject of this paper. In this paper, we model the temporal evolution of the request generation process by assuming that a processor can have only a small number B of memory access requests queued at a time [7]. We analyze the performance of a circuit-switched interconnection network which utilizes the gated-hold protocol and is subject to such an arrival process.

1.1  Proposed  Protocol 

A synchronous circuit-switched Delta network with a holding protocol was presented in [11] in the context of a system that allowed for no queuing. This protocol will now be briefly described. Assume that time is divided into periods. At the beginning of each period, the processors submit their requests for connection to the memory devices. As these requests propagate through the multistage interconnection network, some are blocked and others progress. The requests that are blocked hold their partial paths. After the requests that did not get blocked are serviced, the blocked requests continue their advance, starting with those in the stage of the MIN closest to the outputs of the network. This process continues until all requests have been served, at which time a new period begins and the processors submit new requests that may have arrived. Unlike the synchronous circuit-switched protocol with request resubmission or dropping, these service periods are not of equal length, and depend upon the number of collisions that occur during contention for the communications paths. Such a service period will be called a network cycle.

1.2  Request  Generation  and  Service  Model 

We adopt a model in which each processor generates a request with a fixed probability in each fixed-length switching cycle. Furthermore, at most one request per processor can accumulate over the variable-length network cycle. Thus, the longer the network cycle, the larger the number of accumulated requests. Therefore, we are able to capture the temporal evolution of the request generation process that one would expect to encounter in a MIN that employs a blocked access scheme. The temporal evolution of the system is as follows.

•  In each switching cycle, each processor generates a request that is uniformly destined to any of the memory modules with probability q_i, 1 ≤ i ≤ N, independent of other devices and independent of any requests that are currently in the network. Therefore, it is possible for a processor to generate a new request even while it has a message in the network awaiting transmission. However, once a processor has generated a request in a network cycle, it cannot generate additional requests until the current network cycle ends and the pending request enters the network for transmission. Consequently, at most one memory reference request can be waiting from each processor at the beginning of a service period. In fact, at the end of a service period that required L switching cycles, each processor has a request with probability 1 - (1 - q_i)^L.

•  At the beginning of a network cycle, all waiting requests enter the network and attempt to establish connections. The network uses the gated-hold protocol described earlier. Therefore, all requests that enter in a network cycle will be serviced before its completion. Consequently, each network cycle requires a random number of switching cycles reflecting the number and destinations of the active requests.

•  If,  at  the  beginning  of  a  network  cycle,  no  requests 
are  present  at  the  input  to  the  network,  we  assume  that 
the  network  waits  one  switching  cycle  to  allow  new 
requests  to  be  generated  and  resumes  operation.  The 
idle  switching  cycle  does  not  waste  bandwidth  nor 
increase  delay  since,  in  this  model,  no  new  requests 
arrive  at  the  input  of  the  network  until  the  instant 
before  the  new  cycle  begins. 
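The request-accumulation rule in the first item above can be checked with a short simulation. The sketch below is an illustration only: the function names are hypothetical, a single processor with per-cycle generation probability q is considered, and the network-cycle length L is simply given as an input rather than produced by a model of the gated-hold protocol.

```python
import random


def p_request_waiting(q, cycle_length):
    """Closed form from the text: 1 - (1 - q)^L."""
    return 1.0 - (1.0 - q) ** cycle_length


def simulate_request_waiting(q, cycle_length, trials, seed=0):
    """Monte Carlo estimate of the same probability for one processor."""
    rng = random.Random(seed)
    waiting = 0
    for _ in range(trials):
        # A processor holds at most one pending request, so it is enough to
        # know whether any of the L switching cycles generated one.
        if any(rng.random() < q for _ in range(cycle_length)):
            waiting += 1
    return waiting / trials


if __name__ == "__main__":
    q, L = 0.1, 5
    print("closed form:", p_request_waiting(q, L))
    print("simulated:  ", simulate_request_waiting(q, L, trials=20000))
```

Because the probability grows with L, a long network cycle tends to be followed by a heavily loaded one, which is the coupling the queueing analysis has to capture.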

We measure the communication delay (D_i) from the instant at which the message is generated by the processor at node i to the instant at which the network cycle in which that request is served finishes. The delay of a particular message is, in most cases, significantly less than D_i, but this measure provides us with a conservative estimate of communication delay.

The throughput of each input port (T_i) is defined to be the average number of messages that are serviced at node i in log(N) switching cycles. This aggregate measure will enable us to compare this protocol with other interconnection schemes.


2  Analysis 

In principle, an exact Markovian analysis of the proposed system is possible. However, the number of system states that need to be tracked grows exponentially with the number of inputs, so this approach is not practical. Consequently, we adopted a different approach. Based on the observation that most of the analytical hurdles follow from the stochastic dependencies within the system, we formulated a worst-case “dominating system” that exhibits a reduced degree of dependency.

A  simpler  structure  can  be  derived  if  we  consider  an 
addition  strategy  in  which  the  switch,  upon  seeing  a  single 
request  at  its  outputs,  produces  another  request  at  its  other 
output  with  a  certain  probability  irrespective  of  whether  or 
not  blocking  occurred  within  the  switch. 

More specifically, the reduced degree of coupling in the dominating system allows us to derive a recursive equation that expresses the time to service a set of requests at the input to j stages of an N-input network in terms of the time to service a set of requests at the input to the j - 1 remaining stages of the same N-input network.
2.1 Analytical Technique

Define a random variable g_N(p, j) to be the number of switching cycles required to set up circuits for a set of requests with independent and uniform distributions with marginal input probability p (i.e., the total time required to set up connections for all input devices requiring such connections) at the input to the remaining j stages of a square (N-input) Banyan network. A switching cycle is defined as the time taken for a request to propagate through one stage of the switching network. Throughout this analysis, a switching cycle is assumed to take one unit of time.

In [11], a recursive equation for the mean time to service a batch of uniform requests in the dominating system is derived. This result was later extended in [12] to incorporate non-uniform input and output probabilities. Using similar techniques, in [13] a recursive equation for the conditional probability mass function was derived for uniform input probabilities and uniform destination distributions. In the next section, we utilize the result of the analyses summarized in Section 2.1 to study the dynamic evolution of the system as successive batches of requests are generated and held in a queue until they are serviced.

2.2 Queueing Analysis of a Processor/Memory Interconnect

Define the random variable S_n to be the length in switching cycles of the nth network cycle. The length of the nth cycle is determined by the number of requests present at its beginning. This length, in turn, is determined by the length of the previous cycle. Formally, we can show that {S_n} is a Markov chain. The Markov chain S is aperiodic and irreducible. It has a finite state space and, consequently, is ergodic and possesses a stationary distribution π_j. To find the stationary statistics of this chain, we need to compute its transition probabilities. These can be obtained from the recursive solution for the probability mass function presented in [13] for uniform input and routing probabilities only. Although they are a bit tedious to compute, these transition probabilities are easily determined for small networks. When the network size increases, the exact computation becomes harder to perform and alternative approaches must be explored. Further details of the exact analysis are omitted for brevity.

2.2.1 Drift Analysis

Because the transition probabilities of the Markov chain S are difficult to compute for large networks, we consider another technique that yields the approximate steady-state behavior of the chain. The quantity

f(i) = E[S_{n+1} - S_n | S_n = i]

is  defined  as  the  drift  of  the  chain  5.  If  this  function 
is  concave  >  0),  the  zero  of  this  function  upper 

bounds  the  steady  state  average  of  the  chain  S. 

To show this, we utilize a result from [3]. Since the
chain S is ergodic and has a finite state space, we have
from [3]

$$E_i\big[E[S_n - S_{n-1} \mid S_{n-1} = i]\big] = 0 \qquad (1)$$

If $f(i)$ is concave, then by (1) and Jensen's inequality,

$$0 = E_i\big[E[S_n - S_{n-1} \mid S_{n-1} = i]\big] \le E\big[S_n - S_{n-1} \mid S_{n-1} = E[S]\big] \qquad (2)$$

Furthermore, if the function $f(i)$ is such that

$$f(i) > 0 \;\; \forall i < i', \qquad f(i) < 0 \;\; \forall i > i'$$

then from (2) we have

$$i' \ge E[S] \qquad (3)$$
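The bound in (3) can be illustrated numerically. The sketch below is a minimal example under stated assumptions: it does not use the recursive expression for $g^N$ from [12] or [13]. Instead it uses a hypothetical toy chain in which the next cycle length is one plus a Binomial(m, 1 - (1 - q)^i) count, so that the drift is concave and available in closed form. The names `drift`, `drift_zero`, `stationary_mean`, and the parameter `m` are illustrative only.

```python
from math import comb


def drift(i, m, q):
    """Drift f(i) = E[S_n - S_{n-1} | S_{n-1} = i] of the toy chain."""
    return 1.0 + m * (1.0 - (1.0 - q) ** i) - i


def drift_zero(m, q, tol=1e-10):
    """Bisection for the zero i' of the concave drift on [1, m + 1]."""
    lo, hi = 1.0, float(m + 1)  # f(lo) = m*q > 0 and f(hi) < 0 for 0 < q < 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if drift(mid, m, q) > 0.0:
            lo = mid
        else:
            hi = mid
    return hi


def stationary_mean(m, q, iters=2000):
    """Mean of the stationary distribution, found by power iteration."""
    states = list(range(1, m + 2))  # S takes the values 1, ..., m + 1
    rows = []
    for i in states:
        p = 1.0 - (1.0 - q) ** i  # chance a source holds a request after i cycles
        rows.append([comb(m, r) * p**r * (1.0 - p) ** (m - r) for r in range(m + 1)])
    pi = [1.0 / len(states)] * len(states)
    for _ in range(iters):  # pi <- pi P
        pi = [sum(pi[a] * rows[a][b] for a in range(len(states)))
              for b in range(len(states))]
    return sum(s * w for s, w in zip(states, pi))
```

For any 0 < q < 1 the value returned by `drift_zero` should be no smaller than the value returned by `stationary_mean`, which is the content of (3) for this toy chain.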

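The drift can be expressed in terms of the recursive
equation presented in [12] as

$$
\begin{aligned}
f(i) &= E[S_n - S_{n-1} \mid S_{n-1} = i] \\
     &= \sum_{j \ge 2} j \Pr[g^N(1 - (1 - q)^i, \alpha, k) = j] \\
     &\quad + 1 \cdot \big(\Pr[g^N(1 - (1 - q)^i, \alpha, k) = 1] + \Pr[g^N(1 - (1 - q)^i, \alpha, k) = 0]\big) - i \\
     &= E[g^N(1 - (1 - q)^i, \alpha, k)] + \Pr[g^N(1 - (1 - q)^i, \alpha, k) = 0] - i
\end{aligned}
$$

where the vector operations are taken elementwise.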

Because we do not have an analytic form for the function
$f(i)$, it would be a difficult exercise to formally prove
the  two  properties  required  to  show  that  this  is  a  bound. 
However,  based  on  graphical  evidence  we  can  then  show 


that 

,  ^ -  —  G  - 

E[W{]  > 

j'  1 

1  -  (1  -  fliV  "  9>' 

and 

E[Ti]  ~ 

using Jensen's inequality as a bound and approximation
respectively.

For comparison, we computed the exact value of the
steady state average cycle time and compared it to that
found by this approximate bounding technique for
uniform input and output distributions. The closeness of the
curves suggests that the approximation, which is
significantly easier to compute, provides an excellent measure of
steady state system performance.

3  Results 

For small networks, the exact system analysis, omitted
here for brevity, and the approximate analysis of Section
2.2.1 are very close. Figures 1 and 2 show that the two
curves are almost indistinguishable. Because of the
difficulties in computing the exact analysis for large networks,
we display only those results from the approximate
analysis for large values of N in the succeeding figures.

Since these MINs have $\log(N)$ stages, we can
expect the delay to grow with network size at least as fast
as $\log(N)$. Although we have not found an analytic
asymptotic expression for $E[g^N(1,k)]$, we can compute
$E[g^N(1,k)]$ for extremely large values of N (N > 2^°).
Figure 6 shows these curves to be proportional to $O(\sqrt{N})$.
For large values of N, this quantity, and hence the total
communications delay, grows like $O(\sqrt{N})$ (Fig. 3). It appears
that for relatively small values of N, the maximum delay
grows like $O(\log(N)\sqrt{N})$. Additionally, as is evident
from Fig. 4, the throughput of the network diminishes as
order $1/\sqrt{N}$.

The delay/throughput performance of several small size
networks is shown in Fig. 5. We see delay increasing
as $\sqrt{N}$ and throughput decreasing as $1/\sqrt{N}$. As such, this
protocol may be poorly suited for use in extremely large
interconnection networks. The importance of tracking both
the delay and the throughput is evident from this figure. In
the absence of such an analysis, one obtains figures for the
maximum throughput without recognizing that sometimes
operating a little below the maximum can result in a significant
reduction in communication delay.

4  Conclusions 

In this paper, a circuit-switched multistage
processor/memory interconnection system was studied under a
dynamic request generation model. The temporal
evolution of the request generation process was modelled by
assuming that the likelihood of a processor generating a
request was a function of the length of the previous network
cycle. The network utilized a gated-hold protocol in which
partial path information was retained to speed circuit
setup time. Approximate and exact bounds for system delay
and throughput were derived. Quantitative results suggest
that the delay grows as $O(\sqrt{N})$ when the network size
becomes large. The derivation of an asymptotic expression
for $E[g^N(1,k)]$ as N grows large would identify
asymptotic delay and throughput expressions. However, since
the maximum delay and throughput can be computed from
$E[g^N(1,k)]$, we can compute delay and throughput bounds
for all manageable size networks (N ≈ 1 × 10®). Poor
performance at very large network size limits the use of this
protocol to small networks.

The analysis presented in this paper provides a way
of computing figures for the maximum throughput and
communication delay. Knowledge of the delay-throughput
curve makes it possible to choose an operating point below
the maximum throughput point to reduce communication
delay. Furthermore, the bounding methodology used in
this paper is evidence of the usefulness of the technique in
studying the stochastic evolution of these networks.


Figure 1: Actual steady state cycle time versus approximate
steady state, N = 8.

Figure 4: Approximate steady state throughput (normalized)
versus request arrival probability q.

Figure 6: Actual values of $E[g^N(1,k)]$ for large k versus
asymptotic approximation.


Request Resubmission in a Blocking, Circuit-Switched,
Interconnection Network


Paul Dietrich and Ramesh R. Rao


Abstract-In this paper, we study the delay
of a circuit switched, self-routing Delta network in which
all requests are guaranteed service. A gated hold protocol,
that retains partial path information, is used to guarantee
service. A novel technique, that involves the construction
of an easier to analyze dominant system, is presented. A
recursive expression for the probability mass function of the
cycle time in the dominant system is derived. Comparison
of the dominant system analysis with simulation of the actual
system shows that the dominant system [...]
for increased throughput.

Keywords- Parallel Processing, Resubmission, Non-Blocking,
Protocols, Interconnection Network, Performance.


I. Introduction

Multistage Interconnection Networks (MINs) have been
utilized in Parallel Processing Systems to facilitate
connection between processors and among processors and
memories. Over the years, a number of MIN architectures have
been proposed and analyzed. A representative survey may
be found in [2]. In order to select between the many
existing MIN architectures, one might strive to characterize all
the alternatives in terms of a common set of performance
metrics. In this regard, at times, one encounters
alternatives that are not easily compared because they differ with
respect to their functionality.

One such instance occurs when attempting to compare
blocking and non-blocking networks. Non-blocking
networks, although more costly to build, offer full connectivity
between any free input and output regardless of the traffic
pattern. On the other hand, blocking networks are less
expensive to build, but, depending on the traffic pattern,
messages may be blocked due to contention within the
network. Clearly we must enhance the functionality of blocking
networks to guarantee message delivery before comparing
them with non-blocking architectures.

One  possible  solution  is  to  resubmit  blocked  messages 
until  eventual  delivery.  Because  requests  may  have  to  be 
submitted several times, each request will experience a random
delay depending on the traffic pattern. The overhead
incurred  by  this  process  of  resubmissions,  which  is  not  eas¬ 
ily  determined  because  of  the  stochastic  nature  of  the  block¬ 
ing,  is  the  object  of  this  study. 


Supported by NSF under grant NCR.8904029 and by Air Force
Rome Laboratories under contract F30602-95-Roo65


A. Previous Research

In his pioneering analysis, Patel [1] ignored the
resubmission of blocked requests and derived an expression for
the bandwidth of a Banyan network. Kruskal and Snir's
analysis [3] extends Patel's work and provides asymptotic
call blocking results, but also assumes that blocked requests
are not resubmitted. In a synchronous circuit-switched
network with light traffic, the resubmission of blocked requests
is likely to have little effect on system performance; requests
are seldom blocked and therefore do not cause strong
temporal traffic correlations at the network input. However,
under heavy traffic conditions, it is likely that a significant
portion of requests will be blocked during each cycle,
causing time correlations that will affect performance. Under
these traffic conditions, frequent blocking and hence
resubmissions will occur and the variability of the communication
delay may increase to unacceptable levels. Because, in
parallel processing systems, job delay is often determined by
the slowest processor, it is desirable to have an
interconnection network with low delay variability. It is therefore
imperative that we determine the precise impact of
resubmissions on system performance.

In their analysis of an asynchronous circuit switched
Banyan network, Wu and Lee [6] considered the effect of
resubmission of blocked requests. A self-routing Banyan
network was studied under the typical assumption that
requests which experience contention are “regenerated
randomly at some later time with a random destination
distribution.” This result was contrasted with a more realistic
blocking model in which requests that experience
contention continue to persist in the network (using a drop or
hold strategy) until they are serviced. Wu and Lee's
results show that the “regenerative” assumption overestimates
performance (by as much as 30 percent for the small sized
networks tabulated).

Wu and Lee model the position of each request in the
network as a Markov chain. A separate chain is needed to
describe the state of each request in the network.
Naturally, the chains of all the requests are dependent, as blocking
causes dependence among the requests. However, Wu and
Lee assume independence among the chains, and decouple
the chains to simplify the analysis. Because the stochastic
dependencies among the requests are ignored, the resulting
simplified analysis is likely to be optimistic. One can
therefore expect that even their results, which show that ignoring
blocked requests gives optimistic results, are themselves optimistic.

Bhattacharya, Rao, and Lin [9] recently derived an
upper bound on message delay for a synchronous
circuit-switched Delta network that accommodates the
resubmission of blocked requests. To preserve uniformity and
independence of the message input distribution they
assumed that blocked requests are resubmitted after a
random number of cycles and that the resubmitted requests
re-randomize their destinations. When several devices require
connection through the same switch in the MIN, the
re-randomization of the blocked requests may yield
performance results that are optimistic as the chance of repeated
collisions is lowered.

B. Motivation

Recently,  Dietrich  and  Rao  [10]  analyzed  a  synchronous, 
circuit-switched square Banyan network of 2 x 2
crossbar switches implementing a gated-hold protocol in which
partial  information  about  path  setup  is  retained.  In  this 
protocol,  once  a  set  of  requests  enters  the  network,  the 
network  is  closed  (gated)  and  no  new  requests  enter  un¬ 
til  all  current  requests  have  been  serviced.  This  protocol, 
like  the  hold  protocol  in  [6],  allows  requests  to  hold  their 
partial  paths  until  the  blocking  request  completes  its  ser¬ 
vice.  Unlike  the  asynchronous  hold  protocol,  the  protocol 
presented  in  [10]  is  gated.  Gating  the  network  may  re¬ 
duce  the  network  throughput,  but  guarantees  that  requests 
are  not  blocked  by  other  requests  that  may  arrive  at  the 
network  many  cycles  later.  In  [10],  the  authors  derived 
a  bound  for  the  mean  time  to  fully  service  a  batch  of  re¬ 
quests  appearing  at  the  input  to  the  network.  This  gated 
scheme  preserves  the  cycle  by  cycle  FCFS  property  of  the 
synchronous circuit-switched network, and attempts to
reduce the delay variability by preventing the mixing of new
and  blocked  requests.  However,  the  gated-hold  strategy 
achieves  this  at  the  expense  of  network  utilization.  During 
the  end  of  the  network  cycle,  only  a  few  requests  are  left 
inside  the  network  utilizing  resources  that  could  be  shared 
with  the  requests  waiting  outside.  If  one  can  relax  the  gat¬ 
ing  and  allow  new  requests  to  enter  after  a  fixed  length  of 
time,  chosen  to  assure  that  only  a  small  number  of  requests 
remain inside the network when the gate is opened, one can
expect  an  increase  in  throughput  performance  at  a  minimal 
cost  in  delay  variability. 

It  is  clear  that  we  can  gain  valuable  insight  into  the  per¬ 
formance  of  such  a  scheme  by  examining  the  probability 
mass  function  for  the  time  to  completely  service  a  batch  of 
requests  under  the  gated  hold  protocol  introduced  in  [10]. 
The probability mass function provides complete
information about the delay distribution, and as such, is valuable in
determining  the  effect  of  prematurely  truncating  the  cycle 
to  increase  the  network  throughput.  With  this  informa¬ 
tion,  the  network  designer  can  better  control  the  mixing  of 
traffic  and  thus  the  delay  variability  and  FCFS  nature  of 
the  system. 

In this paper, we derive an approximate expression for
the  probability  mass  function  of  the  time  required  to  ser¬ 
vice  a  batch  of  independent  and  uniform  requests.  This 
technique  also  makes  it  possible  to  derive  simple  expres¬ 
sions  for  higher  moments  of  the  delay,  which  provides  a 
designer  with  direct  information  on  delay  variability. 

In the next section, we describe the network protocol,
model, and the underlying assumptions used in the
analysis. We construct the dominant system in Section III.
A recursive expression for the probability mass function of
the cycle time is derived in Section IV-C. In Section V,
we present the quantitative results for comparison
purposes. Finally, discussion and conclusion sections interpret
the significance of these results and suggest ways in which
these results can aid in designing more efficient versions of
this protocol.

II.  Proposed  Protocol 

A.  Description 

A synchronous circuit-switched Delta network with a
holding  protocol,  briefly  presented  in  [10],  is  now  described 
in  some  detail.  Assume  that  time  is  divided  into  periods. 
At  the  beginning  of  each  period,  the  processors  submit  their 
requests  for  connection  to  the  memory  devices.  No  new  re¬ 
quests  may  be  submitted  until  all  these  requests  have  been 
served. As these requests propagate through the multistage
interconnection network, some are blocked and others
progress.  The  requests  that  are  blocked  hold  their  partial 
paths.  After  the  requests  that  did  not  get  blocked  are  ser¬ 
viced,  the  blocked  requests  continue  their  advance,  starting 
with  those  in  the  stage  of  the  MIN  closest  to  the  outputs 
of  the  network.  This  process  continues  until  all  requests 
have  been  served  at  which  time  a  new  period  begins  and 
the  processors  submit  new  requests  that  may  have  arrived. 
Unlike  the  synchronous  circuit-switched  protocol  with  re¬ 
quest  resubmission  or  dropping,  these  service  periods  are 
not  of  equal  length,  and  depend  upon  the  number  of  colli¬ 
sions  that  occur  during  contention  for  the  communications 
paths.  Such  a  service  period  will  be  called  a  network  cycle. 

B. Illustration

As an example, consider the set of active users
represented by dots in Fig. 1a. After one switching cycle, no
collisions occur and all requests advance to the next stage
(Fig. 1b). During the second switching cycle, some of the
requests require connection through the same switch
outputs and only 4 of them progress (Fig. 1c). Of those that
progress, 1 is blocked in the next switching cycle and the
remaining requests transmit their messages and release their
circuits requiring d switching cycles (Fig. 1d). In the first
switching cycle following this, the blocked request in the
last stage advances and transmits its message (Fig. 1e).
In the switching cycle following this, both requests in the
second stage advance (Fig. 1f). Finally, in the last
switching cycle, these two requests set up a circuit and transmit
their messages (not shown). At this point a new network
cycle begins with a new input distribution.

C.  Implementation 

To assess the implementation complexity of this protocol,
first note that a user that is not blocked requires no
information beyond maintaining synchronization with the network.
When a user becomes blocked, it “holds” the circuit and
needs to determine when to resume progress. This
information can be derived by maintaining a counter at each stage
of the network. The counter must track contentions that
may occur at other stages. Additionally, the counter must
track whenever a set of requests (or a single request) finishes
transmission and releases its circuits. Because contention
can occur in only one stage during any switching cycle,
only a single wired-or signal is required to carry contention
information to all stages. Similarly, because there will be
at most one set of requests that release their circuits
during a cycle, only a single signal is required to carry release
information.

Fig. 1. A sample path of a MIN implementing the gated hold protocol

One implementation of the gated hold protocol is now
described. If a request is blocked at a certain stage, it
triggers a counter associated with the stage. This counter
is initialized to 1. If, during any of the switching cycles
following this event, contention occurs in a stage of switches, the
counter is increased by 1. If, during any of the switching
cycles, a set of requests release their circuits, the counter
is decreased by 1. When the stage counter reaches 0, all
requests blocked in that stage are signaled to advance.
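The counter rule just described can be sketched in a few lines. This is a minimal illustration rather than the hardware design: the class name `StageCounter`, its methods, and the choice to ignore a second block event while the counter is already armed are assumptions made for the example. The two boolean arguments stand for the shared wired-or contention and release signals.

```python
class StageCounter:
    """Up/down counter attached to one stage of the network (illustrative)."""

    def __init__(self):
        self.count = None  # None: no request is currently blocked at this stage

    def on_block(self):
        """A request was blocked at this stage: start counting from 1."""
        if self.count is None:
            self.count = 1

    def on_cycle(self, contention, release):
        """Apply the signals of one later switching cycle.

        Returns True when the blocked requests of this stage may advance.
        """
        if self.count is None:
            return False
        if contention:  # contention occurred in some stage of switches
            self.count += 1
        if release:  # a set of requests released their circuits
            self.count -= 1
        if self.count == 0:
            self.count = None  # disarm until the next block event
            return True
        return False
```

A stage that is blocked, then observes one contention followed by two releases, is signaled to advance on the second release.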

The overhead imposed by this protocol is characterized
by the hardware complexity required to implement the
counters and the wired-or control lines. The up/down
counters in an N-input network must count to a maximum of
$\log(N)$, requiring a complexity for all the counters of order
$\log(N)\log(\log(N))$. The wired-or signaling is required at
each switch of the network, growing in complexity of order
$N\log(N)$. Thus, the additional complexity of the network
grows at a rate less than or equal to the hardware
complexity of the original network.


D.  Discussion 

To  better  understand  the  value  of  this  protocol  let  us 
contrast  it  with  protocols  that  allow  dropping.  The  loca¬ 
tion  and  extent  of  contention  within  a  switch  is  of  course 
a  function  of  the  request  generation  and  destination  selec¬ 
tion  process,  with  certain  request  patterns  inducing  more 
blocking  than  others.  A  switch  that  is  allowed  to  drop  a 
certain  average  fraction  of  its  traffic  (no  matter  how  small), 
will operate most efficiently when it selectively drops those
patterns  that  are  most  contention  prone. 

Where do these dropped requests go? In a realistic
system, blocked requests will be resubmitted at a later time by
a “higher layer” entity that recognizes the blocking. This
in turn would imply an accumulation over time of the most
difficult  to  service  requests.  The  resulting  degradation  in 
performance  might  be  quite  undesirable.  In  fact,  a  study 
by  Heidelberger  and  Franaszek  [o]  of  a  circuit  switch  re¬ 
vealed  that  diverting  the  contention  prone  traffic  to  a  bypass 
network  resulted  in  uncharacteristically  rapid  saturation  of 
the  bypass  network.  Thus,  a  case  can  be  made  that,  in  the 
study  of  protocols  that  allow  dropping,  it  is  necessary  to 
characterize  the  relative  dependencies  of  the  set  of  dropped 
requests.  Failing  this,  one  has  an  incomplete  understanding 
of  the  performance  of  the  interconnection  network. 

There  is  another  loss  of  information  that  occurs  when 
requests  are  dropped.  Typically,  when  requests  collide, 
the  switches  allow  one  group  to  advance  and  hold  back 
the remaining. In doing so, a MIN generates the
information necessary to partition a group of requests into
non-colliding groups via the mechanism of contentions.
Dropping partially resolved requests and mixing them with other
requests  destroys  this  potentially  useful  information  right 
after  incurring  the  performance  penalty. 

Against  this  backdrop,  two  key  aspects  of  our  protocol 
stand  out.  First,  by  guaranteeing  service  to  all  requests 
we  develop  an  understanding  of  the  complete  service  time. 
At  the  same  time  by  retaining  path  setup  and  contention 
information  we  preserve  and  exploit  available  information. 

III.  Construction  of  The  Dominant  System 

The  difficulty  in  the  analysis  of  the  cycle  time  is  primarily 
due  to  the  fact  that  even  if  requests  arriving  at  a  switch  are 
statistically  independent,  requests  departing  from  a  switch 
are  clearly  statistically  dependent.  This  is  a  consequence 
of  the  blocking  within  the  switch  and  implies  that  the  set  of 
requests  departing  from  the  first  stage  are  not  independent, 
making  the  arrival  process  to  the  following  stages  dependent 
across  all  switches. 

We overcome this hurdle by constructing a hypothetical
system in which the forwarded and residual requests do
not have a complicated interdependency. This makes the
hypothetical system more tractable. At the same time, as
our results will show, the hypothetical system captures the
essential characteristics of the actual system. The basic
idea is to find a way to add ‘dummy’ requests at the
outputs of the switches to make the output request
distributions uniform and independent. If we restrict the strategy
to adding requests and never removing them, then the
network cycle time will always be longer. Thus the dominant
system would provide an upper bound on delay. Similarly,
throughput (of real requests) will be lower for every sample
path and the dominant system will provide a lower bound
to the system throughput. It must be noted that in the
operation of the true system, no ‘dummy’ requests need be
added to the requests.

The  dominant  system  is  not  designed  to  remove  all  de¬ 
pendence  among  requests.  Requests  within  the  dominant 
system  still  exhibit  some  dependence.  However,  the  dom¬ 
inant  system  is  designed  to  force  independence  among  the 
set  of  requests  at  the  output  of  a  stage.  Similarly,  the  dom¬ 
inant  system  is  designed  to  force  independence  among  the 
set  of  residual  requests  at  a  stage.  Jointly,  however,  the 
forwarded  and  residual  requests  are  not  independent. 

Clearly,  adding  requests  to  ensure  that  at  every  stage, 
every  input  has  a  request  would  make  requests  trivially 
independent  since  requests  would  be  present  with  probab¬ 
ility  one  independent  of  all  other  requests.  But,  such  an 
extreme  request  addition  strategy  would  not  provide  reas¬ 
onable  bounds  to  the  performance  of  the  actual  system.  It 
is not immediately evident if there exist non-trivial request
addition schemes that produce the desired independence
for all network loads.

In this section we develop a non-trivial request addition
strategy  that  yields  useful  results  for  all  values  of  the  input 
probability  p.  The  addition  strategy,  explicitly  specified 
in  Section  III-B,  requires  that  each  switch,  upon  seeing 
a  single  request  at  its  outputs,  adds  another  request  at 
its  empty  output  with  a  certain  probability  irrespective  of 
whether  or  not  blocking  occurred  within  the  switch. 

This  addition  strategy  was  specifically  chosen  to  be  inde¬ 
pendent  of  whether  contention  occurred  within  the  switch. 
An  alternative  request  addition  strategy  could  be  based 
upon  using  information  within  the  switch  regarding  colli¬ 
sions.  For  example,  a  request  could  be  added  on  certain 
ports  every  time  (with  probability  one)  contention  occurs. 
Such  request  addition  strategies  were  studied  and  found 
to  be  intractable  due  to  the  dependence  of  the  strategy  on 
collisions  and  are  not  considered  in  this  paper. 

A.  Notation 

•  Let  X  denote  a  vector  of  indicator  random  variables, 
each  of  whose  elements  take  the  value  1  with  prob¬ 
ability  p  independently  of  others.  This  vector  models 
the  presence  of  requests  at  the  inputs  to  the  k  stage 
network.  The  marginal  input  probability  p,  which  de¬ 
scribes  the  load  on  the  network  per  input  port  per 
cycle,  is  the  main  parameter  of  interest  and  is  explicitly 
represented  in  the  results. 

• Let Y describe the vector of indicator random variables
corresponding to those requests that pass out of the
first stage after one cycle. Because we have assumed
that the inputs to the switches are independent, $Y_j$ will
be independent of $Y_k$ for all combinations of outputs k
and j that are not from the same switch. Specifically,

$$P(Y_1 = y_1, Y_2 = y_2, Y_3 = y_3, Y_4 = y_4) = P(Y_1 = y_1, Y_2 = y_2) \cdot P(Y_3 = y_3, Y_4 = y_4)$$

Each pair from the same switch has the following joint
distribution for $i = 1, 2, \ldots, N/2$:

$$
\begin{aligned}
P(Y_{2i-1} = 1, Y_{2i} = 1) &= \tfrac{1}{2}p^2 \\
P(Y_{2i-1} = 1, Y_{2i} = 0) &= \tfrac{1}{4}p^2 + p(1-p) \\
P(Y_{2i-1} = 0, Y_{2i} = 1) &= \tfrac{1}{4}p^2 + p(1-p) \\
P(Y_{2i-1} = 0, Y_{2i} = 0) &= (1-p)^2
\end{aligned}
$$
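The joint distribution above can be checked by exhaustive enumeration of a single 2 x 2 switch. The following is a minimal sketch under the stated model (each input holds a request with probability p, each request picks one of the two outputs uniformly, and one of two colliding requests is held back); the function name `switch_output_distribution` is illustrative and exact rational arithmetic is used so the comparison is not affected by rounding.

```python
from fractions import Fraction
from itertools import product


def switch_output_distribution(p):
    """Exact joint distribution of (top, bottom) output occupancy of one switch."""
    dist = {(0, 0): Fraction(0), (0, 1): Fraction(0),
            (1, 0): Fraction(0), (1, 1): Fraction(0)}
    for top_in, bot_in in product((0, 1), repeat=2):
        w_in = (p if top_in else 1 - p) * (p if bot_in else 1 - p)
        # Destination 0 is the top output, destination 1 the bottom output.
        for top_dst, bot_dst in product((0, 1), repeat=2):
            out = [0, 0]
            if top_in:
                out[top_dst] = 1
            if bot_in:
                out[bot_dst] = 1  # a collision leaves only one output occupied
            dist[tuple(out)] += w_in * Fraction(1, 4)
    return dist
```

For example, `switch_output_distribution(Fraction(1, 3))` gives probability 1/18 for both outputs occupied and 4/9 for both outputs empty.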


After  the  requests  pass  the  first  stage,  the  marginal  dis¬ 
tributions  of  the  requests  that  advanced  remain  uniform, 
but  these  requests  are  no  longer  independent.  It  is  this 
aspect  that  we  shall  focus  on  next. 

B. Injection of Dummy Requests

Consider a dominating system that adds ‘dummy’
requests at the outputs of the switches to make the output
request distributions uniform and independent.
Specifically, consider an addition strategy in which the switch, upon
seeing a [1,0] or [0,1] at its output, independently produces
an extra request along the empty path with probability a.
The elements of [top, bottom] are indicator random
variables signifying the presence or absence of a request at the
output of the top and bottom ports of the switch. For
example, [0,1] implies that there is no request present on the
top output of the switch but there is a request present on
the bottom output of the switch. The destinations of these
injected requests are chosen uniformly among all possible
destinations.
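The addition rule for a single switch can be written down directly. The sketch below is illustrative only: the function name `inject_dummy` and the tuple encoding of [top, bottom] are assumptions, and the uniform choice of a destination for the injected request is not modeled.

```python
import random


def inject_dummy(outputs, a, rng=random):
    """Apply the request-addition rule to one switch's (top, bottom) outputs.

    With probability a, a lone request is joined by a dummy request on the
    empty port; [0,0] and [1,1] are never modified.
    """
    top, bottom = outputs
    if top + bottom == 1 and rng.random() < a:
        return (1, 1)  # fill the empty port with a dummy request
    return (top, bottom)
```

Because the rule only ever adds requests, the number of requests leaving a switch in the dominating system is never smaller than in the actual system.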

There  are  four  aspects  of  the  request  addition  policy  that 
are  central  to  the  subsequent  analysis. 

1. The switch never removes requests, but sometimes
adds requests. Consequently, the dominating system
can only be worse than the actual system in performance
for every sample path.

2. No ‘dummy’ requests are injected if the switch sees a
[0, 0] at its output.

3. Each switch makes its decision regarding the injection
of dummy requests independently of the other switches;
hence, no new dependencies across switches are
introduced.

4. Request additions are made independently of whether
or not contention occurred within the switch.
Consequently, contention has no effect on the request
addition strategy.

The challenge is to find a legitimate probability $\alpha$ such
that $Y$ is a vector of uniform, independent random variables.
Towards this end, note that the distribution at the
switch outputs that results from the injection of new
requests in accordance with the strategy just described is
given by:

$$\begin{aligned}
P(Y_{2i-1} = 1, Y_{2i} = 1) &= \tfrac{1}{2}p^2 + 2\alpha\left(p(1-p) + \tfrac{1}{4}p^2\right) \\
P(Y_{2i-1} = 1, Y_{2i} = 0) &= (1-\alpha)\left(p(1-p) + \tfrac{1}{4}p^2\right) \\
P(Y_{2i-1} = 0, Y_{2i} = 1) &= (1-\alpha)\left(p(1-p) + \tfrac{1}{4}p^2\right) \\
P(Y_{2i-1} = 0, Y_{2i} = 0) &= (1-p)^2.
\end{aligned}$$

To engineer independence, the joint probabilities must
equal the product of the marginal probabilities:

$$\begin{aligned}
P(Y_{2i-1} = 1, Y_{2i} = 1) &= P(Y_{2i-1} = 1)\,P(Y_{2i} = 1) \\
P(Y_{2i-1} = 0, Y_{2i} = 0) &= P(Y_{2i-1} = 0)\,P(Y_{2i} = 0).
\end{aligned}$$

The marginal probabilities of $Y_{2i-1}$ and $Y_{2i}$ can be
obtained by adding the appropriate joint probabilities. For
example, $P(Y_{2i-1} = 1) = P(Y_{2i-1} = 1, Y_{2i} = 1) +
P(Y_{2i-1} = 1, Y_{2i} = 0)$. In this way, we can deduce that

$$P(Y_{2i-k} = 1) = p^2\left(\tfrac{3}{4} + \tfrac{\alpha}{4}\right) + p(1-p)(\alpha + 1), \quad k = 0, 1. \qquad (1)$$

Since the joint probability of both outputs equaling zero
is not altered, the marginal probability of equaling zero,
$P(Y_j = 0)$, must be the square root of the joint probability,
or $(1-p)$. Therefore, $P(Y_j = 0) = (1-p)$. Hence, $\alpha$
must satisfy the equation:

$$1 - \left[p^2\left(\tfrac{3}{4} + \tfrac{\alpha}{4}\right) + p(1-p)(\alpha + 1)\right] = (1-p). \qquad (2)$$

The usefulness of this strategy depends on whether this
equation in $\alpha$ has a solution in the interval (0, 1). Solving,
we find that $\alpha = p/(4-3p)$. Furthermore, for all values
of $p$ in (0, 1), $\alpha$ can be easily verified to be in (0, 1). We
have thus established that by using this $\alpha = p/(4-3p)$, we
can engineer the distribution at the output of every stage
to be uniform and independent. With this choice of $\alpha$, the
marginal probability of having a request remains constant
across all stages and equals $p$.

It turns out that by adding requests to the residual
requests in an appropriate manner they too can be made
uniform and independent across all inputs. To see this,
consider adding to the residual requests as they pass through
the first stage of switches (the stage in which they were held
back). As with the forwarded requests, the switch adds a
request with probability $\beta$ ($\beta$ is chosen irrespective of $\alpha$) if
there is a single empty output at the switch. Since there can
be a maximum of one request per switch at the inputs, these
requests pass through the first stage without contention.

We define a random variable $R$ to be the vector random
variable indicating whether a request is present on each
output link of the first stage due to the residual requests.
Using the request addition strategy described, the pairwise
joint distribution that results is:

$$\begin{aligned}
P(R_{2i-1} = 1, R_{2i} = 1) &= \beta\,\tfrac{1}{2}p^2 \\
P(R_{2i-1} = 1, R_{2i} = 0) &= \left(\tfrac{1}{4}p^2\right)(1-\beta) \\
P(R_{2i-1} = 0, R_{2i} = 1) &= \left(\tfrac{1}{4}p^2\right)(1-\beta) \\
P(R_{2i-1} = 0, R_{2i} = 0) &= 1 - \tfrac{1}{2}p^2.
\end{aligned}$$


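To make this joint distribution independent, we must first
deduce the marginal probabilities by appropriately summing
the joint probabilities. By setting the joint probabilities
equal to the product of the marginals we deduce that
$\beta$ must satisfy:

$$\tfrac{1}{4}p^2(1+\beta) = 1 - \sqrt{1 - \tfrac{1}{2}p^2}. \qquad (3)$$

For $p \neq 0$, this equation has the solution:

$$\beta = \frac{4\left(1 - \sqrt{1 - \tfrac{1}{2}p^2}\right)}{p^2} - 1. \qquad (4)$$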
For $p = 0$ all joint and marginal probabilities are 0 and
there is no need to add any ‘dummy’ requests. It is easily
verified for $p \neq 0$ that $\beta$ is a legitimate probability (i.e.
$0 < \beta < 1$).
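
The two injection probabilities can be checked numerically. The sketch below is a hedged illustration, not the authors' code: `alpha` uses the closed form stated in the text, while `beta` is our own solution of the independence condition, obtained from the residual marginal $1 - \sqrt{1 - p^2/2}$ given in equation (5). All function names are hypothetical.

```python
from math import isclose, sqrt


def alpha(p):
    """Injection probability for forwarded requests, from the text."""
    return p / (4.0 - 3.0 * p)


def beta(p):
    """Injection probability for residual requests (our derivation,
    assuming the residual marginal is 1 - sqrt(1 - p^2/2))."""
    return 4.0 * (1.0 - sqrt(1.0 - p * p / 2.0)) / (p * p) - 1.0


def forwarded_joint(p):
    """Joint output distribution of one switch after alpha-injection."""
    a = alpha(p)
    single = p * (1.0 - p) + p * p / 4.0  # exactly one output busy
    return {(1, 1): p * p / 2.0 + 2.0 * a * single,
            (1, 0): (1.0 - a) * single,
            (0, 1): (1.0 - a) * single,
            (0, 0): (1.0 - p) ** 2}


def residual_joint(p):
    """Joint residual distribution of one switch after beta-injection."""
    b = beta(p)
    return {(1, 1): b * p * p / 2.0,
            (1, 0): (1.0 - b) * p * p / 4.0,
            (0, 1): (1.0 - b) * p * p / 4.0,
            (0, 0): 1.0 - p * p / 2.0}


def is_independent(joint, tol=1e-12):
    """True if every joint entry equals the product of its marginals."""
    top = joint[(1, 1)] + joint[(1, 0)]
    bottom = joint[(1, 1)] + joint[(0, 1)]
    return all(
        isclose(prob,
                (top if t else 1.0 - top) * (bottom if b else 1.0 - bottom),
                abs_tol=tol)
        for (t, b), prob in joint.items())


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        print(p, alpha(p), beta(p),
              is_independent(forwarded_joint(p)),
              is_independent(residual_joint(p)))
```

For any p in (0, 1) both checks should report independence, with the forwarded marginal equal to p.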

By using the $\beta$ derived above, we can compute the marginal
probabilities of the residual requests in the dominant
system to be

$$P(R_j = 1) = 1 - \sqrt{1 - \tfrac{1}{2}p^2}. \qquad (5)$$

In summary, the fact that $\alpha$ and $\beta$ are between 0 and 1
for all marginal input probabilities $p$ proves that a request
addition strategy satisfying the four properties stated above
exists. In the next section we deduce a recursive expression
for the cycle time. In doing so, we will exploit the fact that
the forwarded and residual requests are uniform and independent
across all inputs and that the respective marginal
probabilities are $p$ and $1 - \sqrt{1 - \tfrac{1}{2}p^2}$.

IV.  Derivation  of  the  Distribution  Function 

As stated in the introduction, the probability mass function
provides complete information about the delay distribution,
and as such, is valuable in determining a number of
performance characteristics including the effect of prematurely
truncating the cycle to increase the network throughput.
Thus, we need to determine the probability mass function
of the time required to service a batch of independent
and uniform requests. We shall do so by analyzing the
dominant system described in the previous section. The
approximation technique presented in this section makes it
possible to derive the probability mass function as well as
expressions for higher moments of the delay. The results
of this section will be compared against simulations in the
results section.
A. Notation

Let $f(X,k)$ denote the number of switching cycles required
to set up circuits for a set of requests represented
by the vector $X$ at the input to $k$ stages of an $N$-input
network. As defined in Section III, each element of $X$ takes
the value one with probability $p$ independent of other elements.
Our goal is to find the probability mass function
of $f(X,k)$. Let the function $F(\cdot)$ denote the set of forwarded
requests. Similarly, let $R(\cdot)$ denote the set of residual
requests remaining including ‘dummy’ requests. For convenience,
define two indicator random variables $I_F(\cdot)$ and
$I_R(\cdot)$, where $I_F(x) = 1$ if the vector $x$ has any non-zero
elements and $I_F(x) = 0$ otherwise. $I_R(X) = 1$ if the $X$
requests will result in contention. For convenience we denote
the events $\{I_F(\cdot) = 1\}$, $\{I_R(\cdot) = 1\}$, $\{I_F(\cdot) = 0\}$ and
$\{I_R(\cdot) = 0\}$ by $I_F(\cdot)$, $I_R(\cdot)$, $I_F(\cdot)^c$ and $I_R(\cdot)^c$ respectively.

B. Recursion

The quantity of primary interest is $\Pr[f(X,k) = m]$.
Conditioning on the event $\{I_F(X)\}$ we have

$$\begin{aligned}
\Pr[f(X,k) = m] &= \Pr[f(X,k) = m \mid I_F(X)] \cdot \Pr[I_F(X)] \\
&\quad + \Pr[f(X,k) = m \mid I_F(X)^c] \cdot \Pr[I_F(X)^c]
\end{aligned} \qquad (6)$$

where $\Pr[I_F(X)] = 1 - (1-p)^N$. The conditional probability
from the second term of (6) is equal to:

$$\Pr[f(X,k) = m \mid I_F(X)^c] = \begin{cases} 1, & m = 0 \\ 0, & \text{otherwise.} \end{cases}$$

The conditional probability in the first term of (6) can be
further conditioned on $I_R(X)$, defined to be the event that
there is blocking as a result of contention in the first set of
switches.

$$\begin{aligned}
\Pr[f(X,k) = m \mid I_F(X)] &= \Pr[f(X,k) = m \mid I_F(X), I_R(X)] \cdot \Pr[I_R(X) \mid I_F(X)] \\
&\quad + \Pr[f(X,k) = m \mid I_F(X), I_R(X)^c] \cdot \Pr[I_R(X)^c \mid I_F(X)]
\end{aligned} \qquad (7)$$

For an initial input distribution with marginal probability
$p$:

$$\Pr[I_R(X) \mid I_F(X)] = \frac{1 - \left(1 - \tfrac{1}{2}p^2\right)^{N/2}}{1 - (1-p)^N} \qquad (8)$$
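
As a sanity check on the conditional contention probability, the sketch below enumerates every first-stage input pattern for a small network and compares the result with the closed form $(1 - (1 - p^2/2)^{N/2}) / (1 - (1-p)^N)$, which is our reading of equation (8). It assumes that inputs 2i and 2i+1 share a switch and that contention means both inputs request the same output; the function names are ours.

```python
from itertools import product


def contention_given_requests(p, n):
    """Exact Pr[contention at some switch | at least one request]
    for n first-stage inputs (n even), by brute-force enumeration."""
    states = [("none", 1.0 - p), ("top", p / 2.0), ("bottom", p / 2.0)]
    pr_requests = 0.0
    pr_contention = 0.0
    for combo in product(states, repeat=n):
        weight = 1.0
        for _, w in combo:
            weight *= w
        names = [name for name, _ in combo]
        if all(name == "none" for name in names):
            continue  # no request anywhere: outside the conditioning event
        pr_requests += weight
        # Inputs 2i and 2i+1 feed the same switch; they contend when
        # both carry a request for the same output.
        if any(names[i] != "none" and names[i] == names[i + 1]
               for i in range(0, n, 2)):
            pr_contention += weight
    return pr_contention / pr_requests


def contention_closed_form(p, n):
    """Closed form as we read equation (8)."""
    return (1.0 - (1.0 - p * p / 2.0) ** (n // 2)) / (1.0 - (1.0 - p) ** n)


if __name__ == "__main__":
    print(contention_given_requests(0.3, 4), contention_closed_form(0.3, 4))
```

The enumeration grows as 3^n, so it is only practical for very small n; it is meant as a check, not as a way to evaluate the formula.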

The event $\{I_F(X) = 1\}$ denotes the presence of one or
more requests in the vector $X$. At least one of these requests
will be forwarded to the next stage in one switching
cycle (there is a winner for every contention). Therefore
the event $\{I_F(X) = 1\}$ implies $\{I_F(F(X)) = 1\}$, or written
mathematically,

$$\{I_F(X) = 1\} \subseteq \{I_F(F(X)) = 1\}.$$

The second property that we have imposed on the dominant
system (Sec. III) guarantees that no requests will be added
at a switch if there were no requests at its inputs. This
property ensures that the event $\{I_F(F(X)) = 1\}$ implies
that there must have been at least one request in the vector
$X$, or

$$\{I_F(F(X)) = 1\} \subseteq \{I_F(X) = 1\}.$$

Combining these two statements, we can write

$$\{I_F(F(X)) = 1\} = \{I_F(X) = 1\}$$

or in our shorthand notation defined in Section IV-A,

$$I_F(X) = I_F(F(X)).$$

A similar observation can be made about the events
$\{I_F(X) = 1, I_R(X) = 1\}$ and $\{I_F(R(X)) = 1\}$. The presence
of requests in the vector $F(X)$ along with the event
that contention occurred ensures that there will be at least
one request in the vector $R(X)$. Again, because the dominant
system adds no requests upon seeing a [0, 0] at its output,
the presence of residual requests in the vector $R(X)$ ensures
that contention occurred ($\{I_R(X) = 1\}$) and that
there were original requests in the vector $X$. Formally, we
can write, in the shorthand notation

$$I_F(X), I_R(X) = I_F(R(X)).$$

We can utilize these relationships to further simplify the
first term in the expression (7), namely

$$\begin{aligned}
&\Pr[f(X,k) = m \mid I_F(X), I_R(X)] \\
&\quad = \begin{cases} \Pr[f(X,k) = m \mid I_F(X), I_F(F(X)), I_R(X)], & 2k \le m \le 2(2^k - 1) \\ 0, & \text{otherwise} \end{cases} \\
&\quad = \begin{cases} \Pr[f(F(X),k-1) + f(R(X),k-1) = m - 2 \mid I_R(X), I_F(F(X))], & 2k \le m \le 2(2^k - 1) \\ 0, & \text{otherwise.} \end{cases}
\end{aligned} \qquad (9)$$

In equation (9) the range of $m$ is bounded from below
by $2k$ because it is impossible for a network cycle to take
less than $2k$ switching cycles given $I_F(X)$ and $I_R(X)$. $m$
is upper bounded by the number of cycles required if there
is contention at every switching cycle where contention is
possible.

By induction, one can show that the maximum number of
cycles required is $2(2^k - 1)$ for a $k$ stage network. To verify
the induction, consider a single 2x2 switch. The maximum
number of cycles required is 2. Assume that it requires a
maximum of $2(2^k - 1)$ cycles for a $k$ stage network. The
requests at the input to a $k+1$ stage network will require
one cycle to reach the input to the remaining $k$ stages. The
residual request left over at the first stage (assuming a worst
case) will only require one cycle to reach the input to the
remaining $k$ stages. Thus it will require $2 + 2 \cdot 2(2^k - 1)$
or $2(2^{k+1} - 1)$ cycles for a $k+1$ stage network, completing
the induction.
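
The induction can be replayed mechanically. The short sketch below (function names are ours) builds the worst case from the inductive step, one cycle each for the forwarded and residual groups to clear the first stage plus a worst-case pass of each group through the remaining stages, and compares it with the closed form $2(2^k - 1)$.

```python
def max_cycles(k):
    """Worst-case number of switching cycles for a k-stage network,
    built from the inductive step described in the text."""
    if k == 1:
        return 2  # a single 2x2 switch needs at most two cycles
    # Two cycles to clear the first stage (forwarded, then residual),
    # then each group may need the worst case of the remaining stages.
    return 2 + 2 * max_cycles(k - 1)


def closed_form(k):
    """Closed-form bound 2(2^k - 1)."""
    return 2 * (2 ** k - 1)


if __name__ == "__main__":
    for k in range(1, 7):
        print(k, max_cycles(k), closed_form(k))
```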

The last statement of (9) follows, because the residual
requests will pass the first stage in one switching cycle since
there can be no contention. Conditioned on $I_R(X)$, we
know that there will be both residual and forwarded requests,
and therefore, since we have engineered the forwarded
and residual request distributions to be uniform and
independent given $I_F(X)$ and $I_R(X)$, the last statement
follows.

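Equation (9) can be rewritten as a sum over the set
$\Gamma = \{(a,b) : a + b = m - 2\}$ as

$$\begin{aligned}
&\Pr[f(X,k) = m \mid I_F(X), I_R(X)] \\
&\quad = \begin{cases} \displaystyle\sum_{(a,b) \in \Gamma} \Pr[f(F(X),k-1) = a,\ f(R(X),k-1) = b \mid I_F(F(X)), I_R(X)], & 2k \le m \le 2(2^k - 1) \\ 0, & \text{otherwise} \end{cases} \\
&\quad \approx \begin{cases} \displaystyle\sum_{(a,b) \in \Gamma} \Pr[f(F(X),k-1) = a \mid I_F(F(X)), I_R(X)] \cdot \Pr[f(R(X),k-1) = b \mid I_F(F(X)), I_R(X)], & 2k \le m \le 2(2^k - 1) \\ 0, & \text{otherwise} \end{cases} \\
&\quad = \begin{cases} \displaystyle\sum_{(a,b) \in \Gamma} \Pr[f(F(X),k-1) = a \mid I_F(F(X))] \cdot \Pr[f(R(X),k-1) = b \mid I_F(R(X))], & 2k \le m \le 2(2^k - 1) \\ 0, & \text{otherwise.} \end{cases}
\end{aligned} \qquad (10)$$

The second to last step here is an approximation, since
correlations may exist between the forwarded and residual
requests. The dominant system does not remove all of the
dependence among the requests. We address the accuracy
of this approximation in the results section. The last step
follows from the engineered independence in the dominant
system.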

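In a similar fashion, we can find the expression for the
second term of (7). Noting that even if there is no contention,
it will require at least $k$ switching cycles to process the
requests given $I_F(X)$, we may deduce that

$$\begin{aligned}
&\Pr[f(X,k) = m \mid I_F(X), I_R(X)^c] \\
&\quad = \begin{cases} \Pr[f(X,k) = m \mid I_F(X), I_R(X)^c, I_F(F(X))], & k \le m \le 2(2^{k-1} - 1) + 1 \\ 0, & \text{otherwise} \end{cases} \\
&\quad = \begin{cases} \Pr[f(X,k) = m \mid I_F(F(X)), I_R(X)^c], & k \le m \le 2(2^{k-1} - 1) + 1 \\ 0, & \text{otherwise} \end{cases} \\
&\quad = \begin{cases} \Pr[f(F(X),k-1) = m - 1 \mid I_F(F(X)), I_R(X)^c], & k \le m \le 2(2^{k-1} - 1) + 1 \\ 0, & \text{otherwise} \end{cases} \\
&\quad = \begin{cases} \Pr[f(F(X),k-1) = m - 1 \mid I_F(F(X))], & k \le m \le 2(2^{k-1} - 1) + 1 \\ 0, & \text{otherwise.} \end{cases}
\end{aligned} \qquad (11)$$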


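Combining (10) and (11) we have

$$\Pr[f(X,k) = m \mid I_F(X)] = \begin{cases}
\Pr[f(F(X),k-1) = m-1 \mid I_F(F(X))] \cdot \Pr[I_R(X)^c \mid I_F(X)], & k \le m < 2k \\[2ex]
\begin{aligned}
&\Pr[f(F(X),k-1) = m-1 \mid I_F(F(X))] \cdot \Pr[I_R(X)^c \mid I_F(X)] \\
&\ + \sum_{(a,b) \in \Gamma} \Pr[f(F(X),k-1) = a \mid I_F(F(X))] \cdot \Pr[f(R(X),k-1) = b \mid I_F(R(X))] \cdot \Pr[I_R(X) \mid I_F(X)],
\end{aligned} & 2k \le m \le 2(2^{k-1} - 1) + 1 \\[2ex]
\displaystyle\sum_{(a,b) \in \Gamma} \Pr[f(F(X),k-1) = a \mid I_F(F(X))] \cdot \Pr[f(R(X),k-1) = b \mid I_F(R(X))] \cdot \Pr[I_R(X) \mid I_F(X)], & 2(2^{k-1} - 1) + 1 < m \le 2(2^k - 1).
\end{cases} \qquad (12)$$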


The probabilities $\Pr[I_R(X) \mid I_F(X)]$ and
$\Pr[I_R(X)^c \mid I_F(X)]$ are found explicitly from the marginal
probabilities and the independence in the dominant system
derived in Section III. Computation of the probabilities
follows from first principles. For example, $\Pr[I_F(X)]$ is the
probability that there is at least one request at the input.
Since requests are independent with marginal probability
$p$, this is $1 - (1-p)^N$ where $N$ is the number of inputs to
the stage. Substituting the expressions for $\Pr[I_R(X) \mid I_F(X)]$
and $\Pr[I_R(X)^c \mid I_F(X)]$ derived earlier,

$$\Pr[f(X,k) = m \mid I_F(X)] = \begin{cases}
0, & m < k \\[1ex]
\Pr[f(F(X),k-1) = m-1 \mid I_F(F(X))] \cdot \dfrac{(1 - \tfrac{1}{2}p^2)^{N/2} - (1-p)^N}{1 - (1-p)^N}, & k \le m < 2k \\[2ex]
\begin{aligned}
&\Pr[f(F(X),k-1) = m-1 \mid I_F(F(X))] \cdot \dfrac{(1 - \tfrac{1}{2}p^2)^{N/2} - (1-p)^N}{1 - (1-p)^N} \\
&\ + \sum_{(a,b) \in \Gamma} \Pr[f(F(X),k-1) = a \mid I_F(F(X))] \cdot \Pr[f(R(X),k-1) = b \mid I_F(R(X))] \cdot \dfrac{1 - (1 - \tfrac{1}{2}p^2)^{N/2}}{1 - (1-p)^N},
\end{aligned} & 2k \le m \le 2(2^{k-1} - 1) + 1 \\[2ex]
\displaystyle\sum_{(a,b) \in \Gamma} \Pr[f(F(X),k-1) = a \mid I_F(F(X))] \cdot \Pr[f(R(X),k-1) = b \mid I_F(R(X))] \cdot \dfrac{1 - (1 - \tfrac{1}{2}p^2)^{N/2}}{1 - (1-p)^N}, & 2(2^{k-1} - 1) + 1 < m \le 2(2^k - 1).
\end{cases} \qquad (13)$$

To complete the derivation, we need to compute
$\Pr[f(X,1) = m \mid I_F(X)]$. The total time for all the requests
to be served through one stage is the maximum (over all
switches) of the time required to forward pairs of requests
through each switch. Since the inputs to the switches are
independent due to the dominant system construction, methods
of independent order statistics can be used. Assuming
elements of the vector $X$ at the last stage have marginal
probability $p$, we use order statistics to show

$$\Pr[f(X,1) = m \mid I_F(X)] = \begin{cases}
0, & m = 0 \\[1ex]
\dfrac{(1 - \tfrac{1}{2}p^2)^{N/2} - (1-p)^N}{1 - (1-p)^N}, & m = 1 \\[2ex]
\dfrac{1 - (1 - \tfrac{1}{2}p^2)^{N/2}}{1 - (1-p)^N}, & m = 2.
\end{cases} \qquad (14)$$

Equations (13) and (14) form the recursive solution for
the probability mass function of $f(X,k)$ given the event
$I_F(X)$. Once this quantity is computed, the conditioning
on $I_F(X)$ is removed using (6). This yields an equation for
$\Pr[f(X,k) = m]$.

C. Delay Moments in the Dominating System

In this section, we illustrate how the recursive equation
(13) just derived can be used to obtain higher moments. We
compute the first moment, cross check it against the bound
produced in [10], and then compute the second moment.

To find $E[f(X,k)]$ we first determine $E[f(X,k) \mid I_F(X)]$
and then remove the conditioning. Multiplying by $m$ and
summing on both sides of (13) yields

$$\begin{aligned}
E[f(X,k) \mid I_F(X)] &= \sum_{m=k}^{2k-1} m \Pr[f(F(X),k-1) = m-1 \mid I_F(F(X))] \cdot \frac{(1 - \tfrac{1}{2}p^2)^{N/2} - (1-p)^N}{1 - (1-p)^N} \\
&\quad + \sum_{m=2k}^{2(2^{k-1}-1)+1} m \Pr[f(F(X),k-1) = m-1 \mid I_F(F(X))] \cdot \frac{(1 - \tfrac{1}{2}p^2)^{N/2} - (1-p)^N}{1 - (1-p)^N} \\
&\quad + \sum_{m=2k}^{2(2^k-1)} \sum_{(a,b) \in \Gamma} m \Pr[f(F(X),k-1) = a \mid I_F(F(X))] \cdot \Pr[f(R(X),k-1) = b \mid I_F(R(X))] \cdot \frac{1 - (1 - \tfrac{1}{2}p^2)^{N/2}}{1 - (1-p)^N}.
\end{aligned} \qquad (15)$$

Combining the first two terms of (15) and substituting $s$ for
$m - 1$ yields

$$\sum_{s=k-1}^{2(2^{k-1}-1)} (s+1) \Pr[f(F(X),k-1) = s \mid I_F(F(X))] \cdot \frac{(1 - \tfrac{1}{2}p^2)^{N/2} - (1-p)^N}{1 - (1-p)^N}$$

which equals

$$\left(E[f(F(X),k-1) \mid I_F(F(X))] + 1\right) \cdot \frac{(1 - \tfrac{1}{2}p^2)^{N/2} - (1-p)^N}{1 - (1-p)^N}. \qquad (16)$$

Examining the last term of (15) it is noted that
$\Pr[f(F(X),k-1) = a \mid I_F(F(X))]$ and
$\Pr[f(R(X),k-1) = b \mid I_F(R(X))]$
are non-zero only for $k-1 \le a, b \le 2(2^{k-1} - 1)$. Hence

$$\sum_{m=2k}^{2(2^k-1)} \sum_{(a,b) \in \Gamma} m \Pr[f(F(X),k-1) = a \mid I_F(F(X))] \cdot \Pr[f(R(X),k-1) = b \mid I_F(R(X))] \cdot \frac{1 - (1 - \tfrac{1}{2}p^2)^{N/2}}{1 - (1-p)^N} \qquad (17)$$

is equal to

$$\sum_{a=k-1}^{2(2^{k-1}-1)} \sum_{b=k-1}^{2(2^{k-1}-1)} (a + b + 2) \Pr[f(F(X),k-1) = a \mid I_F(F(X))] \cdot \Pr[f(R(X),k-1) = b \mid I_F(R(X))] \cdot \frac{1 - (1 - \tfrac{1}{2}p^2)^{N/2}}{1 - (1-p)^N} \qquad (18)$$

which equals

$$\left(2 + E[f(F(X),k-1) \mid I_F(F(X))] + E[f(R(X),k-1) \mid I_F(R(X))]\right) \cdot \frac{1 - (1 - \tfrac{1}{2}p^2)^{N/2}}{1 - (1-p)^N}. \qquad (19)$$

Combining (16) and (19) we obtain the conditional recursive
equation

$$\begin{aligned}
E[f(X,k) \mid I_F(X)] &= \left(E[f(F(X),k-1) \mid I_F(F(X))] + 1\right) \cdot \frac{(1 - \tfrac{1}{2}p^2)^{N/2} - (1-p)^N}{1 - (1-p)^N} \\
&\quad + \left(E[f(F(X),k-1) \mid I_F(F(X))] + E[f(R(X),k-1) \mid I_F(R(X))] + 2\right) \cdot \frac{1 - (1 - \tfrac{1}{2}p^2)^{N/2}}{1 - (1-p)^N}.
\end{aligned} \qquad (20)$$

Multiplying both sides by $1 - (1-p)^N$, removing the conditioning
in each term and rearranging terms we get:

$$E[f(X,k)] = E[f(F(X),k-1)] + E[f(R(X),k-1)] + 2 - \left(1 - \tfrac{1}{2}p^2\right)^{N/2} - (1-p)^N. \qquad (21)$$

This equation is identical to the equation for the mean derived
in [10]. Thus the approximation (Eqn. 10) made in
assuming independence of the forwarded and residual requests
does not have any effect on the first moment.

The equation for the second moment and thus the variance
of the cycle time can be computed by examining the
conditional recursive equations for the first and second moments.
The conditional recursive equation for the first moment
is shown in (20). To find the recursive equation for
the second moment multiply both sides of (13) by $m^2$ and
sum over all values of $m$. This yields

$$\begin{aligned}
E[f(X,k)^2 \mid I_F(X)] &= \sum_{m=k}^{2k-1} m^2 \Pr[f(F(X),k-1) = m-1 \mid I_F(F(X))] \cdot \frac{(1 - \tfrac{1}{2}p^2)^{N/2} - (1-p)^N}{1 - (1-p)^N} \\
&\quad + \sum_{m=2k}^{2(2^{k-1}-1)+1} m^2 \Pr[f(F(X),k-1) = m-1 \mid I_F(F(X))] \cdot \frac{(1 - \tfrac{1}{2}p^2)^{N/2} - (1-p)^N}{1 - (1-p)^N} \\
&\quad + \sum_{m=2k}^{2(2^k-1)} \sum_{(a,b) \in \Gamma} m^2 \Pr[f(F(X),k-1) = a \mid I_F(F(X))] \cdot \Pr[f(R(X),k-1) = b \mid I_F(R(X))] \cdot \frac{1 - (1 - \tfrac{1}{2}p^2)^{N/2}}{1 - (1-p)^N}.
\end{aligned} \qquad (22)$$

Using the same technique used to solve (15) we can solve
this, yielding

$$\begin{aligned}
E[f(X,k)^2 \mid I_F(X)] &= \Pr[I_R(X)^c \mid I_F(X)] \left(E[f(F(X),k-1)^2 \mid I_F(F(X))] + 2E[f(F(X),k-1) \mid I_F(F(X))] + 1\right) \\
&\quad + \Big(E[f(F(X),k-1)^2 \mid I_F(F(X))] + E[f(R(X),k-1)^2 \mid I_F(R(X))] \\
&\qquad + 2E[f(F(X),k-1) \mid I_F(F(X))] \cdot E[f(R(X),k-1) \mid I_F(R(X))] \\
&\qquad + 4E[f(F(X),k-1) \mid I_F(F(X))] + 4E[f(R(X),k-1) \mid I_F(R(X))] + 4\Big) \cdot \Pr[I_R(X) \mid I_F(X)].
\end{aligned} \qquad (23)$$

To complete the recursion we need to find
$E[f(X,1)^2 \mid I_F(X)]$. This is computed to be

$$E[f(X,1)^2 \mid I_F(X)] = \frac{4 - 3\left(1 - \tfrac{1}{2}p^2\right)^{N/2} - (1-p)^N}{1 - (1-p)^N}.$$

Combining (23) and (20) yields a conditional equation for
the variance of the cycle time. The unconditional expressions
are then easily derived. In the results section we
compare the actual results obtained via simulation with the
computed upper bound.
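
The recursion for the mass function is easier to follow as code. The sketch below is a minimal Python rendering under our reading of equations (6), (13), (14) and (21): forwarded requests keep marginal p, residual requests have marginal $1 - \sqrt{1 - p^2/2}$, and the number of links N is the same at every stage. It requires p > 0, and all function names are ours rather than the authors'.

```python
from math import sqrt


def residual_marginal(p):
    """Marginal probability of a residual request, equation (5)."""
    return 1.0 - sqrt(1.0 - p * p / 2.0)


def contention_split(p, n):
    """Return (Pr[no contention | requests], Pr[contention | requests])
    for one stage with n links and marginal request probability p."""
    some_request = 1.0 - (1.0 - p) ** n
    no_contention = (1.0 - p * p / 2.0) ** (n // 2)
    return ((no_contention - (1.0 - p) ** n) / some_request,
            (1.0 - no_contention) / some_request)


def conditional_pmf(p, n, k):
    """pmf[m] = Pr[f(X,k) = m | at least one request], m = 0..2(2^k - 1)."""
    clear, contend = contention_split(p, n)
    pmf = [0.0] * (2 * (2 ** k - 1) + 1)
    if k == 1:  # base case, equation (14)
        pmf[1], pmf[2] = clear, contend
        return pmf
    fwd = conditional_pmf(p, n, k - 1)
    res = conditional_pmf(residual_marginal(p), n, k - 1)
    for m in range(len(pmf)):
        # No contention at the first stage: one cycle, then k-1 stages.
        if 0 <= m - 1 < len(fwd):
            pmf[m] += clear * fwd[m - 1]
        # Contention: two cycles plus forwarded and residual times,
        # summed over the pairs (a, b) with a + b = m - 2.
        conv = sum(fwd[a] * res[m - 2 - a]
                   for a in range(len(fwd)) if 0 <= m - 2 - a < len(res))
        pmf[m] += contend * conv
    return pmf


def unconditional_pmf(p, n, k):
    """Remove the conditioning as in equation (6); an empty batch takes 0 cycles."""
    some_request = 1.0 - (1.0 - p) ** n
    pmf = [some_request * x for x in conditional_pmf(p, n, k)]
    pmf[0] += 1.0 - some_request
    return pmf


def mean_by_recursion(p, n, k):
    """First moment from the recursion in equation (21)."""
    if k == 0:
        return 0.0
    return (mean_by_recursion(p, n, k - 1)
            + mean_by_recursion(residual_marginal(p), n, k - 1)
            + 2.0 - (1.0 - p * p / 2.0) ** (n // 2) - (1.0 - p) ** n)


if __name__ == "__main__":
    pmf = unconditional_pmf(0.3, 8, 3)  # 8-input network, 3 stages
    print([round(x, 4) for x in pmf])
    print(sum(m * x for m, x in enumerate(pmf)), mean_by_recursion(0.3, 8, 3))
```

Because every level spawns a forwarded and a residual branch, the work roughly doubles per stage on top of the convolution cost, which is consistent with the remark in the conclusions that computing the mass function becomes expensive as the network grows. The mean obtained from the mass function should agree with the mean recursion, in line with the observation that the independence approximation does not affect the first moment.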


V.  Results 

We now present the results of the mass function and moment
analysis. To examine the tightness of the bounds produced
by various techniques, we compare these results to a
simulation of the actual gated hold protocol where no additional
requests are injected. Simulations were run with
10,000 and 100,000 iterations and showed a maximum difference
of 0.6%. Most simulations presented here were run
with 10,000 trials.

The mass functions for an 8-input network are shown
in Fig. 2. A different delay mass function results for each
choice of the marginal input probability p. It is clear from
this plot that for moderate values of p, the mass function
for the cycle time in the dominant system is bimodal. As p
increases, and the weight of the mass function shifts to the
right, there are situations where the graph shows that two
distinct peaks in the mass function exist. To verify the presence
of this feature in the delay distribution of the actual
system, a simulation of the actual system was conducted.

Results in Figs. 3 and 4 indicate that the delay distribution
in the dominant system and actual system closely match
for N = 32 and moderate values of p. To more accurately
judge the error in the mass function analysis, the distribution
function is compared to an estimate of the true
distribution function obtained from simulation with 100,000
trials. The results show (Figs. 5 and 6) that the distribution
function from the analysis of the dominant system bounds
the distribution function obtained via simulation. It can
also be seen that this bound grows looser with increasing
N and p.

The  bimodality  of  the  mass  functions  over  certain  ranges 
of  p  suggests  that  when  the  network  experiences  a  moderate 
amount  of  contention,  requests  are  served  in  two  bursts. 
The  bimodality  suggests  that  contention  is  most  common 
at  the  initial  stages  resulting  in  an  early  splitting  into  two 
relatively  non-conflicting  groups.  These  two  groups  could 
account  for  the  two  peaks. 
We compare the moments derived from the recursive
equation for the mass function of the network cycle time
to those obtained by simulation of the actual system. The
simulation results are based on 10,000 trials. For the values
of N considered, the bounds on the first moment [10]
are tight for moderate values of p (Fig. 7). The second moment
bounds become looser for increasing N (Fig. 8), but
may nonetheless enable system designers to factor in second
moment statistics.

Note that although all moments of the dominating system
bound the corresponding moments of the actual system, the
variance is not a moment and thus it is unclear whether the
variance of the dominant system bounds the variance of the
actual system. Results clearly show that the variance does
not bound the actual system variance. However, the effect
of p on the variance of the network cycle time is accurately
predicted by the analytical curves. The increase in variance
for moderate values of p coincides with the bimodal nature
of the mass function in this range. Because the variance
is not strictly decreasing in p, there may exist an optimal
choice of p in systems that use this protocol. As we suggested
in the introduction, systems that must provide a high
degree of parallel service often rely upon a set of processors
completing their tasks simultaneously. This is easiest to
accomplish when the network performance is predictable.
Thus if one can tailor the speedup of the network so that
it is operating in a low variance mode, better system performance
may be obtained.

VI.  Discussion 

As mentioned in the introduction, this protocol provides
cycle by cycle FCFS service and prevents the mixing of
requests to reduce delay variability. Some of this may occur
at the expense of wasting network resources by holding back
requests when only a small number are using the network.
If even a small fraction of the old requests are dropped and
made to resubmit in the next cycle, the system performance
may further improve even though the request distribution
will no longer remain independent.

Suppose we were to truncate network cycles, so they
never exceed C switching cycles. With a certain probability,
upper bounded by the mass function derived in this
paper, requests will not be fully served in the C cycles.
Assume for simplicity that these requests are dropped to the
beginning of the network for resubmission in the next network
cycle. Note that the requests are no longer independent.
However, for a low resubmission rate, the dependence
will be very small.

To determine the number of requests remaining in the
network when the cycle is truncated, consider several
possibilities. The simplest bound is to assume that every input
port contained a request and that none of the requests were
served (this is impossible for C > k + d). As a slightly better
bound, we could calculate the number of requests that remain
assuming all switch inputs had requests with identical
destinations. Alternately, as a conservative approximation,
one could assume that there are at most Np requests
remaining on average. This is the average number submitted
per network cycle, which is no less than the average number of
requests that remain after truncation. We call this an
approximation because, unlike the two sample path bounds
above, this is based on expected values and the number
of requests remaining at the end of a network cycle is not
strictly less than Np.

As an example consider a 16 input network. What is a
bound on the dropped request rate when the network cycle is
truncated to 9 switching cycles? From the mass function
derived earlier, the probability of the network cycle lasting
longer than 9 switching cycles is bounded by .04 for a
value of p = .2. Assuming that we bound the number of
requests remaining in the system by the second bound discussed
above, there can be at most 12 requests remaining
in the system (this is a very loose bound). Assuming
that the dropped requests are equally likely to be from any
input port (as the switch does not assign priority), an upper
bound on the dropped request resubmission rate is .03
for each port. If we use the conservative approximation,
we obtain a dropped request resubmission rate of .008 per
port.
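
The arithmetic behind the two resubmission rates can be spelled out as follows. This is only a restatement of the example in code; the function name is ours, and the inputs (overrun probability .04, at most 12 remaining requests, 16 ports, p = .2) are the figures quoted in the example above.

```python
def resubmission_rate_per_port(p_overrun, remaining_requests, n_ports):
    """Per-port rate of dropped requests: the probability that the cycle
    is truncated, times the requests assumed to remain, spread evenly
    over the input ports."""
    return p_overrun * remaining_requests / n_ports


if __name__ == "__main__":
    n_ports, p, p_overrun = 16, 0.2, 0.04
    # Sample-path bound: at most 12 requests remain in the system.
    print(resubmission_rate_per_port(p_overrun, 12, n_ports))
    # Conservative approximation: N*p requests remain on average.
    print(resubmission_rate_per_port(p_overrun, n_ports * p, n_ports))
```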

It is possible to increase the efficiency of the proposed
protocol. To see this note that at times, requests which
are held in the network at a particular stage may be able
to advance even before all requests ahead of them have
been serviced. In the above example, two requests are
held in the network for several switching cycles when, in
fact, they could have advanced to their destinations without
contention (Fig. 1e). The protocol described in this paper,
however, clearly bounds a protocol incorporating this
modification, making it useful in the design of a system using
this improvement. Modifying the protocol to allow this sort
of stochastic, opportunistic advancement is not pursued in
this paper.

VII.  Conclusions 

In this paper, we considered an N-input square Delta network
implementing a gated-hold protocol and studied the
probability mass function of its cycle time. The derivations
were validated against simulations.

The quantitative results for N = 8, 16, 32, and 64 show
that the mass function is, in general, not unimodal. This
information, not available from the moments, provides valuable
insight into understanding and possibly modifying this
protocol to increase its efficiency. The time required to
compute the mass function of the delay via this recursive
equation grows rapidly with N. However, a technique was
demonstrated which allows one to derive recursive equations
for arbitrary moments of the delay distribution. The
recursive equations for these higher moments require significantly
less computational effort.

A bound on the second moment of the gated-hold protocol
was explicitly derived here. Although this bound
grows looser with increasing N, it provides a measure of
comparison between interconnection schemes. It is conceivable
that two interconnection networks will have the same
average performance, yet when incorporated into a processing
system where the performance of the slowest processor
is the limiting design factor (i.e. in the execution
of a synchronous algorithm), one network could drastically
outperform the other. The second moment statistics can
help predict such behavior. Finally, the techniques used in
this analysis appear to be quite promising and may find
application in other similar systems.

Fig. 2. Network cycle time mass function of an 8-input network
versus p (d = 0). A delay mass function results for each value of
p.

Fig. 3. Network cycle time mass function of a 32-input network for
p = 0.1 (d = 0).
Fig. 4. Network cycle time distribution function of a 32-input
network for p = 0.3 (d = 0). Comparison of actual versus dominant
systems.

Fig. 5. Network cycle time distribution function of a 16-input
network for p = 0.1, 0.4 (d = 0). Comparison of actual versus
dominant systems.

Fig. 6. Network cycle time distribution function of a 32-input
network for p = 0.1, 0.4 (d = 0). Comparison of actual versus
dominant systems.

Fig. 7. Expected network cycle time versus p (d = 0). Comparison
of actual versus dominant systems for N = 8, 16, 32, 64.

Fig. 8. Comparison of analysis (solid) and simulation (dashed) of
second moment of network cycle time for N-input networks (N =
8, 16, 32; d = 0).

Fig. 9. Comparison of analysis (solid) and simulation (dashed) of
variance of the network cycle time for N-input networks (N =
8, 16, 32, 64; d = 0).
Multiple Order Delay Holograms for Polarization and Color Selectivity

Fang Xu, Rong-Chung Tyan, Joseph E. Ford*, and Yeshayahu Fainman
University of California, San Diego, La Jolla, CA 92093-0497
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Diffractive  optical  elements  constructed  as  phase  only  computer  generated  holograms 
(CGHs)  are  attractive  for  numerous  applications  in  photonics  and  optoelectronics.  A  conventional 
diffractive  optical  element  (DOE)  has  a  maximum  phase  delay  of  2k  between  pixels.  Therefore, 
the  required  etch  depth  is,  in  general,  shallow  (<  wavelength  X,  see  Fig.  la).  These  DOEs  are 
relatively  insensitive  to  the  polarization  and  wavelength  of  the  reconstruction  field  compared  to 
volume  gratings.  Previously  we  demonstrated  polarization  selective  diffractive  optical  elements 
using  two  birefringent  LiNb03  substrates  with  different  diffractive  microstructures  on  the 
interior.^  ^  The  required  etch  depths  on  both  substrates  are  deeper  than  that  in  a  conventional  DOE 
because  the  substrates  birefringence  is  relatively  small  (see  Fig.  lb).  Another  approach  to  achieve 
the  same  functionality  is  based  on  deep  etch  structure  (see  Fig.  Ic)  that  corresponds  to  multiple 
periods  of  phase  delays  (also  called  modular  2m7t)  using  a  single  birefringent  substrate,  as  first 
proposed  in  reference  1  and  later  in  reference  3.  This  approach  may  reduce  the  cost  and  simplify 
the  fabrication  process  of  such  polarization  selective  diffractive  optical  elements.  In  the  following 
we  report  the  design,  fabrication  and  characterization  of  multiple  order  delay  (MOD)  holograms 
that  posses  dual  functionality  in  polarization  or  color. 

To design a MOD hologram with dual impulse responses using a single substrate, we use the
geometrical optics approximation and find the corresponding phase delays caused by an etched
pixel compared to that of an unetched pixel:

$$
\begin{aligned}
k_1 (n_1 - n_{t1})\, d &= 2l\pi + \Phi_1, \\
k_2 (n_2 - n_{t2})\, d &= 2m\pi + \Phi_2,
\end{aligned}
\qquad (1)
$$

where $\Phi_1 + 2l\pi$ and $\Phi_2 + 2m\pi$ are the phase delays exhibited by the two independent optical
reconstruction fields, $d$ is the etch depth of the pixel, $k_1$ and $k_2$ are the wavevectors of the two
reconstruction fields, $n_1$ and $n_2$ are the refractive indices of the substrate for the two reconstruction
fields, $n_{t1}$ and $n_{t2}$ are the refractive indices of the material surrounding the microstructure, and $l$
and $m$ are integers corresponding to the multiple periods of phase delays exhibited by the two
fields. The two independent reconstruction fields can be of different wavelengths or of orthogonal
linear polarizations. In general, Eqs. (1) do not have a unique exact solution for $d$ if $\Phi_1$ and
$\Phi_2$ are arbitrarily specified design values, unless the refractive indices $n_1$ and $n_2$ can be controlled
in every pixel of the diffractive element, as in form-birefringent artificial dielectric nanostructures.
However, for our design with a homogeneous substrate characterized by constant values of $n_1$ and
$n_2$, there exist only approximate solutions for $d$ when the values of the integers $l$ and $m$ are
arbitrarily large, such that

$$
\begin{aligned}
k_1 (n_1 - 1)\, d &= 2l\pi + \Phi_1 + \delta_1, \\
k_2 (n_2 - 1)\, d &= 2m\pi + \Phi_2 + \delta_2,
\end{aligned}
\qquad (2)
$$

where $\delta_1$ and $\delta_2$ are small numbers representing the approximation errors, and air is used as the
medium surrounding the microstructure, so that $n_{t1} = n_{t2} = 1$. If $\delta_1$ and $\delta_2$ are much smaller than the
value of the phase quantization level, this is a valid approximate solution to Eqs. (1). With this
approach, it is possible to design the two independent phase functions within some specified
accuracy at each pixel. Therefore, independent multilevel phase holograms can be implemented for
the two orthogonal polarizations or two different wavelengths. Solving Eqs. (2), we find

$$
d = \frac{(2l\pi + \Phi_1 + \delta_1)\,\lambda_1}{2\pi (n_1 - 1)} = \frac{(2m\pi + \Phi_2 + \delta_2)\,\lambda_2}{2\pi (n_2 - 1)}. \qquad (3)
$$

We used the above algorithm to design and demonstrate two types of MOD holograms with
dual functionality. The first is a polarization selective element made of a single birefringent
yttrium orthovanadate (YVO4) substrate, and the second is a wavelength selective element made
of BK7. YVO4 has large birefringence and is relatively easy to process using microfabrication
techniques. We used x-cut YVO4 crystals grown by CASDC, Inc. The indices $n_o = 2.0241$ and $n_e = 2.2600$
of YVO4 were determined at 0.5145 µm. Using Eqs. (2) and (3) with these values of refractive
indices, we find all the possible combinations of $\Phi_1$ and $\Phi_2$ that are necessary for construction of a
binary phase single substrate birefringent computer generated hologram (SSBCGH) (see Table 1).
In Table 1, $d_l$ and $d_m$ are the calculated exact etch depths from Eq. 1 that are required to obtain $\Phi_1$
and $\Phi_2$ for various integers $l$ and $m$. We observe from Table 1 that choosing a single value for
each etch depth introduces approximation errors of less than about 6% for all cases. This error
can be further reduced by taking the value $d$ as the weighted average of $d_l$ and $d_m$ instead of one
half of their sum. Other optimizations, such as choosing a different set of phase quantization
bases, may also reduce the approximation errors. Furthermore, because the absolute phase in
diffractive optics is of no concern, we can remove an etch depth bias of 1.013 µm (see Table 1)
without affecting the desired relative phase values between different pixels. Therefore, only two
distinct etches are needed to construct a binary phase level SSBCGH (s and t in Table 1).
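The depth and error entries of Table 1 can be recomputed from Eq. (3). The following is a minimal sketch, not the authors' design code: the function names are illustrative, the design wavelength is assumed to be 0.5145 µm, and the error is assumed to be normalized to the target phase (with a zero target taken as 2π, as in the table footnote). The phase pairs and orders (l, m) in the loop are an assumption based on the surviving table rows.

```python
import math

WAVELENGTH_UM = 0.5145      # assumed design wavelength (micrometers)
N_O, N_E = 2.0241, 2.2600   # YVO4 ordinary / extraordinary indices quoted in the text


def exact_depth(order, phi, n, wavelength=WAVELENGTH_UM):
    """Etch depth (um) giving a phase delay of 2*pi*order + phi in air (Eq. 3, delta = 0)."""
    return (2 * math.pi * order + phi) * wavelength / (2 * math.pi * (n - 1))


def phase_error_percent(depth, order, phi, n, wavelength=WAVELENGTH_UM):
    """Phase error at a given depth, as a percentage of the target phase (2*pi if phi == 0)."""
    actual = 2 * math.pi * (n - 1) * depth / wavelength
    target = 2 * math.pi * order + phi
    reference = phi if phi > 0 else 2 * math.pi
    return 100 * (actual - target) / reference


def design_row(phi_o, phi_e, l, m):
    """Return (d_l, d_m, averaged depth, ordinary error %, extraordinary error %)."""
    d_l = exact_depth(l, phi_o, N_O)
    d_m = exact_depth(m, phi_e, N_E)
    d = 0.5 * (d_l + d_m)
    return (d_l, d_m, d,
            phase_error_percent(d, l, phi_o, N_O),
            phase_error_percent(d, m, phi_e, N_E))


if __name__ == "__main__":
    rows = [((0, 0), (4, 5)), ((0, math.pi), (2, 2)),
            ((math.pi, 0), (2, 3)), ((math.pi, math.pi), (4, 5))]
    for phases, orders in rows:
        d_l, d_m, d, err_o, err_e = design_row(*phases, *orders)
        print(f"l,m={orders}: d_l={d_l:.3f} d_m={d_m:.3f} d={d:.4f} "
              f"errors={err_o:+.1f}%, {err_e:+.1f}%")
```

Small differences from the tabulated errors are expected, because the table appears to round the averaged depth before evaluating the error.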

For experimental demonstration and characterization of such a SSBCGH, we constructed
a diffractive polarization beam splitter that diffracts one polarization while transmitting the other.
This is a special case of the dual functionality element that requires a single value of etch depth.
The desired diffractive structure was defined and transferred by electron beam and
photolithographic processes, and then the element was ion beam etched to 1.032 µm. The duty ratio of
the fabricated SSBCGH element was measured to be 1:1. The experimental evaluation of the
element shows 70.8% diffraction efficiency and 79.7:1 polarization contrast ratio (PCR) into the
zero order, 37.4% diffraction efficiency and 33:1 PCR into the +1st order, and 38.9% diffraction
efficiency and 32.5:1 PCR into the -1st order.

To better understand the fabrication accuracy requirements and their effect on the
performance of the fabricated SSBCGH elements, we used rigorous coupled wave analysis
(RCWA)⁵ to simulate the performance of our fabricated element. Fig. 2 shows the simulation
results for diffraction efficiencies and PCRs as functions of etch depth for a grating with a 1:1 duty
ratio and vertical side-walls. From the simulation results, we can observe that good
performance (>40% diffraction efficiency and over 100:1 PCR) can be obtained, which is close to
our geometrical optics design. The experimental performance of the fabricated element is very close
to that predicted by the RCWA (see Fig. 2). From the simulation, we can also see that the
performance of a SSBCGH can be further improved with a more accurate etch depth. Also, the
RCWA results show that the etch depths for the best PCR and the largest diffraction efficiency are
very similar but not identical. This important result implies that the desired etch depth can be
driven by the application needs and may slightly differ from the values provided by the geometric
optics approximate design listed in Table 1.

Using the same approach, we also demonstrated a wavelength selective element for
operation as a color selective beamsplitter for wavelengths of 1.30 µm and 1.55 µm. The substrate
material is BK7 glass. The indices of refraction of BK7 were specified by the supplier (Newport
Optical Materials Inc.) to be 1.5027 at 1.30 µm and 1.5004 at 1.55 µm. Using Eq. 1, the phase
delay for a 7.75 µm deep etch is 5.994π at 1.3 µm and 5.004π at 1.55 µm. This set of values
provides the necessary phase delay for a simple wavelength beam splitter that transmits the 1.3 µm
light field and deflects the 1.55 µm light field. The element was etched to 7.9 µm using a chemically
assisted ion beam etching method with CHF3 as the reactive gas.
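The two phase delays quoted for the 7.75 µm etch follow from the geometrical-optics relation of Eq. 1 with air as the surrounding medium. As a minimal sketch (the helper names are illustrative, not from the paper), the delays and their residuals modulo 2π can be checked as follows; a residual near 0 corresponds to the transmitted wavelength and a residual near π in magnitude to the diffracted one.

```python
# BK7 indices quoted in the text, keyed by wavelength in micrometers.
BK7_INDEX = {1.30: 1.5027, 1.55: 1.5004}


def phase_delay_in_pi(depth_um, n, wavelength_um, n_surround=1.0):
    """Geometrical-optics phase delay of an etched pixel, in units of pi."""
    return 2 * (n - n_surround) * depth_um / wavelength_um


def residual_phase_in_pi(phase_in_pi):
    """Wrap a phase (in units of pi) into the interval [-1, 1)."""
    return (phase_in_pi + 1) % 2 - 1


def bk7_phases(depth_um):
    """Phase delay (units of pi) at each wavelength for a given etch depth."""
    return {wl: phase_delay_in_pi(depth_um, n, wl) for wl, n in BK7_INDEX.items()}


if __name__ == "__main__":
    for wl, phase in bk7_phases(7.75).items():
        print(f"{wl} um: {phase:.3f} pi, residual {residual_phase_in_pi(phase):+.3f} pi")
```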

Figure 3 shows the measured diffraction efficiency and the location of each of the first four
orders for the fabricated element. A perfectly fabricated binary phase hologram, neglecting Fresnel
reflection losses, should have no energy in the even orders, 40.5% in the +/- 1st orders, and 4.5%
in the +/- 3rd orders. At 1.55 µm, the diffraction efficiencies of the fabricated element matched
these numbers closely, with 39% in each of the first orders, 3.6% in each of the +/- 3rd orders,
and a zero order transmission of 0.83%. At 1.3 µm, the transmission was 83%, while the
diffraction into any of the orders was less than 1.2%.
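The ideal figures of 40.5% and 4.5% are the standard scalar-diffraction efficiencies of a binary phase grating with a 1:1 duty ratio and a phase step of π. A minimal sketch of that textbook result, assuming scalar theory and no Fresnel losses (the function name is illustrative):

```python
import math


def binary_grating_efficiency(order, phase_step):
    """Scalar-theory efficiency of a 50%-duty binary phase grating.

    order      -- diffraction order (integer, may be negative)
    phase_step -- phase difference between the two levels, in radians
    """
    if order == 0:
        return math.cos(phase_step / 2) ** 2
    if order % 2 == 0:
        return 0.0  # even orders vanish for a 1:1 duty ratio
    return (2 / (order * math.pi)) ** 2 * math.sin(phase_step / 2) ** 2


if __name__ == "__main__":
    for m in (0, 1, 2, 3):
        print(m, f"{100 * binary_grating_efficiency(m, math.pi):.1f}%")
```

With a phase step near 0 (modulo 2π) the same expression puts nearly all of the energy in the zero order, which is the behavior intended for the transmitted wavelength.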

A MOD hologram is more sensitive to changes in the illumination angle than a conventional
DOE because of the increased optical path differences. We tested the effect of tilting the element,
and found that the performance (first order diffraction efficiency at 1.55 µm and zero order
transmission at 1.3 µm) changed by less than 2% for a 5° tilt, and less than 10% for a 10° tilt. In
fact, the overall performance was slightly improved with a 5° tilt, suggesting that the etch depth
should be increased to 7.93 µm to optimize performance. A field angle of 10° indicates that these
elements are compatible with F/3 and larger optics.
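The link between a 10° field angle and F/3 optics can be checked with the paraxial relation F/# = 1/(2 tan θ). This sketch assumes that the quoted field angle is the half-angle θ of the illuminating cone, which the text does not state explicitly; the function name is illustrative.

```python
import math


def f_number_for_half_angle(half_angle_deg):
    """F-number of a light cone whose marginal rays make the given half-angle with the axis."""
    return 1.0 / (2.0 * math.tan(math.radians(half_angle_deg)))


if __name__ == "__main__":
    print(f"F/{f_number_for_half_angle(10.0):.2f}")
```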

In conclusion, we have demonstrated multiple order delay holograms with dual impulse
responses in polarization or color. The experimental results indicate good performance. Such
elements may be useful in image processing, optoelectronic packaging and photonic switching.

The authors thank P.C. Sun and K. Urquhart for helpful discussions. The research conducted
at UCSD is funded by the National Science Foundation, the Air Force Rome Laboratory, and AFOSR.

Table 1. Design results and the real etch depth required for a binary phase single substrate
BCGH. (* When Φo and/or Φe are zero, they are taken to be 2π for error evaluations.)

d_l, d_m (µm)    d (µm)    d_r (µm)          Error (%)
2.261, 2.246     2.2535    1.2405 (= s+t)    -2.9, +3.8
A polarization-selective computer-generated hologram fabricated upon a single birefringent substrate

Diffractive optical elements (DOE's) constructed
as phase-only computer-generated holograms are
attractive for numerous applications in photonics
and optoelectronics. In general, planar DOE's are
insensitive to the polarization of the illumination.
However, polarization-selective computer-generated
holograms are attractive for numerous applications
because they use polarization as another degree of
freedom to implement two independent and arbitrary
impulse responses for the two orthogonal linear
polarizations. Previously we demonstrated such
polarization-selective DOE's, using two birefringent
LiNbO3 substrates. These birefringent computer-generated
holographic elements have been used for
such applications as transparent photonic switching
and networking, image processing, and packaging of
optical and optoelectronic devices and systems.

A normal DOE has a maximum phase delay of 2π
between pixels. Therefore the required etch depth is,
in general, shallow [see Fig. 1(a)]. Previously demonstrated
birefringent computer-generated holographic
elements consist of two substrates with different
diffractive microstructures on the interior, in which
the two independent surface-relief depths provide us
with the two degrees of freedom to encode the two
independent phase functions for the two orthogonal
linear polarizations in the same element. The
required etch depths on both substrates are deeper
than those in a normal DOE because the birefringence
is relatively small [see Fig. 1(b)]. Another approach
to achieve the same functionality is based on multiple
periods of phase delays (also called modular 2mπ)
with a single birefringent substrate, as first proposed
in Ref. 1 and later in Ref. 4. Such an approach may
reduce the cost and simplify the fabrication process of
such polarization-selective diffractive optical elements.
The required etch depth is much deeper than that in a
normal DOE [see Fig. 1(c)]. In this Letter we report
what to our knowledge is the first experimental demonstration
of a polarization-selective computer-generated
hologram using a single birefringent substrate. We
also describe the design principles, the fabrication
procedures, and the experimental characterization
results of the single-substrate birefringent computer-generated
hologram (SSBCGH).


Fig. 1. Schematics of (a) a conventional DOE, (b) a two-substrate
birefringent computer-generated hologram, and (c) a single-substrate
birefringent computer-generated hologram.

To design a SSBCGH element, we use a multiorder
phase microstructure, in which each pixel of the microstructure
is deeply etched such that propagating
optical waves will exhibit multiple periods of phase
delays. Consider a surface-relief microstructure fabricated
in a birefringent substrate with the optic axis
parallel to the surface of the substrate. Using geometrical
optics, we can find the corresponding phase delays
caused by the surface relief compared with those of
an unetched pixel for the ordinary- and extraordinary-polarized
waves:

$$
\begin{aligned}
(2\pi/\lambda)(n_o - n_t)\, d &= 2l\pi + \Phi_o, \\
(2\pi/\lambda)(n_e - n_t)\, d &= 2m\pi + \Phi_e,
\end{aligned}
\qquad (1)
$$

where $\Phi_o + 2l\pi$ and $\Phi_e + 2m\pi$ are the phase delays
exhibited by ordinary and extraordinary waves,
$\lambda$ is the wavelength of the incident wave in vacuum,
$n_o$ and $n_e$ are the refractive indices for ordinary- and
extraordinary-polarized light, respectively, $n_t$ is the
refractive index of the material surrounding the microstructure,
and $l$ and $m$ are integers corresponding
to the multiple periods of phase delays exhibited by
the ordinary- and extraordinary-polarized light. In
general, Eqs. (1) do not have unique accurate solutions
for $d$ if $\Phi_o$ and $\Phi_e$ are arbitrarily specified design
values, unless the refractive indices $n_o$ and $n_e$ can
be controlled in every pixel of the diffractive element,
as in form-birefringent artificial dielectric nanostructures.
However, for our design with a homogeneous
birefringent substrate with constant values of $n_o$ and
$n_e$, there exist approximate solutions for $d$ when the
values of integers $l$ and $m$ are arbitrarily large such
that

$$
\begin{aligned}
(2\pi/\lambda)(n_o - 1)\, d &= 2l\pi + \Phi_o + \delta_l, \\
(2\pi/\lambda)(n_e - 1)\, d &= 2m\pi + \Phi_e + \delta_m,
\end{aligned}
\qquad (2)
$$

where $\delta_l$ and $\delta_m$ are small numbers introduced to account
for the errors and air is used as the material surrounding
the microstructure, $n_t = 1$. If $\delta_l$ and $\delta_m$ are
much smaller than the phase quantization step, we consider
this to be a valid approximate solution to Eqs. (1).
With this approach it is possible to apply an arbitrary
phase with some specified accuracy to each pixel at
each polarization. Therefore independent multilevel
phase holograms can be implemented for the two polarizations.
Solving Eqs. (2), we obtain

$$
d = \frac{(2l\pi + \Phi_o + \delta_l)\,\lambda}{2\pi (n_o - 1)} = \frac{(2m\pi + \Phi_e + \delta_m)\,\lambda}{2\pi (n_e - 1)}. \qquad (3)
$$

The ideal substrate material suitable for this application
should have large birefringence, to ensure
that the required etch depth $d$ can be fabricated
with high accuracy. We used a new type of birefringent
crystal, yttrium orthovanadate (YVO4), which
has large birefringence and can be relatively easily
processed with microfabrication techniques. The birefringent
substrates are x-cut YVO4 grown by Castech-Phoenix,
Inc. The refractive indices are $n_o = 2.0241$
and $n_e = 2.2600$ at a wavelength of 0.5145 µm. Using
Eqs. (2) and (3) with these values of refractive indices,
we find all the possible combinations of $\Phi_o$ and $\Phi_e$ that
are necessary for construction of a binary-phase-level
SSBCGH (see Table 1). In Table 1, $d_l$ and $d_m$ are
the exact etch depths required for finding the desired
phase delays with integers $l$ and $m$. We observe that
the approximation errors are less than 5% for all cases
except one, which we can solve by taking the value
$d$ as the weighted average of $d_l$ and $d_m$ instead of
one half the summation. Some other optimizations,
such as choosing a different set of phase quantization
bases, may also reduce the approximation errors.
Because the absolute phase in diffractive optics
is of no concern, we can remove an etch-depth bias of
1.013 µm without affecting the desired relative phase
values between different pixels. The new real values
of the etch depths $d_r$ are also listed in Table 1. We
can also observe that one needs only two distinct etches
to construct a binary-phase-level SSBCGH (s and t in
Table 1).

Construction of more efficient multiple-phase-level
elements is also possible. Now the minimum etch-depth
increments are determined by the phase combinations
$\Phi_o = 0$, $\Phi_e = 2\pi/N$ and $\Phi_o = 2\pi/N$, $\Phi_e = 0$,
where $N$ is the number of phase quantization levels.
The required etch depths for other values of phase levels
are combinations of these basic ones. However,
in some cases these values may result in unrealistically
deep etch-depth requirements; moreover, the number
of required etches is more than that for a conventional
multiple-phase-level DOE.

To understand better the fabrication accuracy
requirements and their effect on the performance of
the fabricated SSBCGH elements, we used rigorous
coupled-wave analysis to simulate the performance
of our first fabricated SSBCGH element. Figure 2
shows the simulation results for diffraction efficiencies
and polarization contrast ratios (PCR's) as
functions of etch depth for a grating with a 1:1 duty
ratio and vertical sidewalls. From the simulation
results we can observe that good performance (~41%
diffraction efficiency and >100:1 PCR) can be obtained
from the geometrical optics design. Also, the results
of the rigorous coupled-wave analysis show that the
etch depths for the best PCR and the largest diffraction
efficiency are very similar but not identical (~1%
difference). This important result implies that the
desired etch depth can be driven by the application
needs and may differ slightly from the values provided
by the geometrical optics approximate design listed
in Table 1. Additional simulation and experimental
evaluations show that the trapezoidal shape and the
uneven duty cycle degrade the performance of the
SSBCGH significantly.

Figure 3 shows the PCR as a function of the optical
wave incidence angle for different grating-period-to-wavelength
ratios, calculated with the rigorous
coupled-wave analysis. Grating grooves are
perpendicular to the incident plane. From this figure
we can observe that multiple-order phase-delay
elements are sensitive to the incidence angle. These
curves also indicate that this design approach is valid
only for large grating-period-to-wavelength ratios because,
when the grating period is comparable with or
smaller than the wavelength of the incident optical
field, the form-birefringence effect becomes dominant.
Within this region, not only do the PCR's drop
considerably but, in addition, most of the incident wave
energy propagates into the zeroth diffraction order.


Table 1. Design Results and the Real Etch Depths Required for a Binary-Phase SSBCGH with YVO4

Φo, Φe    l, m    d_l, d_m (µm)    d = (d_l + d_m)/2 (µm)    d_r = d - 1.013 (µm)    Error (%)ᵃ
0, 0      4, 5    2.010, 2.042     2.0260                    1.0130 (= s)            +3.27, -3.9
0, π      2, 2    1.005, 1.021     1.0130                    0.0000                  +1.63, -3.8
π, 0      2, 3    1.256, 1.225     1.2406                    0.2276 (= t)            -6.1, +3.8
π, π      4, 5    2.261, 2.246     2.2535                    1.2405 (= s + t)        -2.9, +3.8

ᵃWhen Φo and Φe are zero, they are taken to be 2π for error evaluations.


Fig.  2.  (a)  Diffraction  efficiencies  and  (b)  PCR’s  of  the 
SSBCGH. 


Fig.  3.  Simulated  PCR’s  as  functions  of  the  incident  angle 
and  the  grating-period-to-wavelength  ratio. 


For experimental demonstration and characterization
of the SSBCGH, we constructed a diffractive
polarization beam splitter that diffracts one polarization
while transmitting the other. This is a
special case of the dual-function element. First, a
40-µm-period grating pattern defined by electron-beam
lithography was photolithographically transferred
into a 1.7-µm-thick photoresist. The resist was
spin coated onto a 100-nm-thick Cr layer evaporated
onto the YVO4 substrate. Then the surface-relief
profile was ion milled into the YVO4 substrate to
1.032 µm. In this multiple-phase-period approach
the etch depth must be controlled with higher accuracy
than that used for construction of conventional
DOE's. This is because the error introduced by the
over/under etch is determined not by the ratio of
etch error to the total etch depth but by the ratio of
the etch error to a fraction of the total etch depth
that is, in effect, responsible for encoding the desired
phase values in a given pixel. The duty ratio of the
fabricated SSBCGH element was measured to be 1:1.
The experimental evaluations of the element show
70.8% diffraction efficiency and 79.7:1 PCR into the
zero order, 37.4% diffraction efficiency and 33.0:1 PCR
into the +1 order, and 38.9% diffraction efficiency
and 32.5:1 PCR into the -1 order. The experimental
performance of the element is very close to that
predicted by the rigorous coupled-wave analysis (see
Fig. 2). From the simulation we can also see that the
performance of a SSBCGH can be further improved
with more-accurate etch depths.
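The sensitivity to over/under etch can be illustrated with the geometrical-optics phase of Eqs. (1). The sketch below is an illustration rather than the authors' analysis: it compares the 1.013 µm design depth with the 1.032 µm fabricated depth, assuming air as the surrounding medium and the YVO4 indices at 0.5145 µm; the function names are hypothetical.

```python
WAVELENGTH_UM = 0.5145      # design wavelength (micrometers)
N_O, N_E = 2.0241, 2.2600   # YVO4 indices quoted in the text


def wrapped_phase_in_pi(depth_um, n, wavelength_um=WAVELENGTH_UM):
    """Phase delay of an etched pixel in air, modulo 2*pi, in units of pi (range [0, 2))."""
    return (2 * (n - 1) * depth_um / wavelength_um) % 2


def phase_shift_in_pi(etch_error_um, n, wavelength_um=WAVELENGTH_UM):
    """Phase change caused by an etch-depth error, in units of pi."""
    return 2 * (n - 1) * etch_error_um / wavelength_um


if __name__ == "__main__":
    for depth in (1.013, 1.032):
        print(f"d = {depth} um: ordinary {wrapped_phase_in_pi(depth, N_O):.3f} pi, "
              f"extraordinary {wrapped_phase_in_pi(depth, N_E):.3f} pi")
    print(f"extraordinary shift for 0.019 um: {phase_shift_in_pi(0.019, N_E):.3f} pi")
```

Under these assumptions, the 0.019 µm difference is under 2% of the total depth but shifts the extraordinary phase by roughly 9% of the π step that encodes the binary grating, which is the effect described above.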

In conclusion, we designed, fabricated, and experimentally
evaluated polarization-selective computer-generated
holograms, using a single birefringent
substrate. The fabricated elements show diffraction
efficiencies close to the theoretical limit and large
polarization contrast ratios. The duty ratio and the
shape of the grating change the performance of the
SSBCGH and need to be controlled accurately. Such
elements may be useful in many applications such as
image processing, transparent photonic switching, and
the packaging of optoelectronic devices and systems.

The authors thank K. Urquhart and P. C. Sun for
helpful discussions. This research is funded by the
National Science Foundation and the U.S. Air Force
Rome Laboratory.
Polarization-selective phase-only birefringent computer-generated
holograms (BCGH's) are general-purpose
diffractive elements that have independent
impulse responses for orthogonal linear polarizations.
Such elements are shown to be useful in many applications,
including packaging optoelectronic devices
or systems, free-space optical interconnects, and image
processing. BCGH's have been demonstrated
with two birefringent substrates and with a single
birefringent substrate. The birefringence of the
substrates in these configurations makes the elements
sensitive to the polarization of the light. The two-substrate
approach is complicated to fabricate because
it includes an assembly process of the two diffractive
structures that requires high alignment accuracy.
The single-substrate approach, on the other hand, is
simpler to fabricate, but it is only an approximate
solution. In this Letter we report a new approach
for design and fabrication of BCGH elements. Our
new approach involves creating a form-birefringent
nanostructure and modulation of the [...]
as well as the birefringence at each pixel of the BCGH.

Form birefringence is a well-known effect of subwavelength
periodic microstructures. The electric fields
parallel to the grating grooves (TE polarization) and
perpendicular to the grating grooves (TM polarization)
need to satisfy different boundary conditions, resulting
in different effective refractive indices for TE- and
TM-polarized waves. Many researchers have demonstrated
this effect in the far-IR region. Recently,
with the help of the advances in nanofabrication,
200-nm period gratings were fabricated in a GaAs
substrate that showed strong form birefringence
in the near IR. Furthermore, these results were
found to be in agreement with the numerical simulation
results obtained by a rigorous coupled-wave
analysis (RCWA). Design optimizations were performed
for BCGH by form birefringence. Chen and
Craighead demonstrated a polarization-insensitive
diffractive optical element that uses two-dimensional
subwavelength periodic microstructures. Aoyama
and Yamashita demonstrated a grating beam splitting
polarizer that uses a subwavelength grating fabricated
in a photoresist. The polarization contrast ratios,
defined as the ratio of intensities obtained under two
orthogonal polarizations at the designed diffraction order,
were ~6:1 and ~3:1 for the 0th and 1st diffraction
orders, respectively.

In what follows, we report the design, fabrication,
and experimental evaluation of a binary phase level
BCGH element that uses form-birefringent nanostructures
[or form-birefringent computer-generated holograms
(FBCGH's)] fabricated upon GaAs substrates
for operation in the near-IR wavelength range. The
FBCGH element is designed to transmit the TE polarization
straight ahead and deflect the TM polarization
at an angle.

Fig. 1. FBCGH design: one period in a FBCGH.

Consider a single period in a binary phase diffractive
structure as shown in Fig. 1. In this period $T$,
one pixel consists of a high-spatial-frequency grating
(HSFG) with period $\Lambda$, and the other pixel is the substrate
material. Of the two periodic structures, the
HSFG does not introduce propagating diffraction orders
other than the 0th order because of its subwavelength
nature. The diffractive structure, on the other
hand, introduces many diffraction orders. The phase
differences between rays 1 and 2 for TE and TM polarizations
are

$$
\begin{aligned}
(2\pi/\lambda)(n_s - n_{TE})\, d &= \Phi_{TE}, \\
(2\pi/\lambda)(n_s - n_{TM})\, d &= \Phi_{TM},
\end{aligned}
\qquad (1)
$$

where $\lambda$ is the wavelength in vacuum, $d$ is the thickness
of the HSFG layer, $n_s$ is the refractive index of the
substrate, and $n_{TE}$ and $n_{TM}$ are the effective refractive
indices of the HSFG for TE and TM polarization,
respectively. When the wavelength is much larger
than the period of the HSFG, second-order effective
medium theory (EMT) can be used to calculate the
effective indices with high accuracy. The effective
indices for TE and TM polarizations calculated with a
second-order EMT are given by


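$$
n_{2TE} = \left[ n_{0TE}^2 + \frac{1}{3} \left( \frac{\Lambda}{\lambda}\, \pi F (1 - F) (n^2 - n_0^2) \right)^2 \right]^{1/2},
$$

$$
n_{2TM} = \left[ n_{0TM}^2 + \frac{1}{3} \left( \frac{\Lambda}{\lambda}\, \pi F (1 - F) \left( \frac{1}{n^2} - \frac{1}{n_0^2} \right) n_{0TM}^3\, n_{0TE} \right)^2 \right]^{1/2}, \qquad (2)
$$

where

$$
n_{0TE} = \left[ F n^2 + (1 - F) n_0^2 \right]^{1/2}, \qquad
n_{0TM} = \left[ \frac{(n n_0)^2}{F n_0^2 + (1 - F) n^2} \right]^{1/2}
$$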


are the effective indices calculated with zero-order
EMT, $\Lambda$ is the period of the HSFG, $F$ is the grating
fill factor of the HSFG (defined as the ratio between
the width of the unetched portion within one period
of grating to the grating period $\Lambda$; see Fig. 1), and $n$
and $n_0$ are the refractive indices of the two materials
that form the HSFG. We chose GaAs as the substrate;
therefore $n = n_s = 3.37$ (index of GaAs at 1.55 µm) and
$n_0 = n_{air} = 1$. In general, $n_{TE} > n_{TM}$. To design a
diffractive polarization beam splitter, we implement a
binary phase grating for TE polarization (i.e., $\Phi_{TE} = \pi$)
without affecting the TM polarization (i.e.,
$\Phi_{TM} = 2\pi$).

Once the reconstruction wavelength $\lambda$ is chosen, the
period of the HSFG, $\Lambda$, can be determined. This $\Lambda$
should be large enough to facilitate the fabrication and
small enough not to cause propagating diffraction
orders higher than the 0th. From our RCWA simulation
we found that

$$
\Lambda < \lambda / n_s \qquad (3)
$$

is a useful criterion. Thus we only need to find the
grating fill factor $F$ and the etch depth $d$ to design the
element. First, we determine $F$. From Eqs. (1) with
our design parameters $\Phi_{TE} = \pi$ and $\Phi_{TM} = 2\pi$ we have

$$
\frac{n_s - n_{TE}}{n_s - n_{TM}} = \frac{\Phi_{TE}}{\Phi_{TM}} = \frac{1}{2}. \qquad (4)
$$


Substitute $n_{2\mathrm{TE}}$ and $n_{2\mathrm{TM}}$ from Eqs. (2) into Eq. (4); choose operating wavelength $\lambda$ = 1.55 μm and HSFG period $\Lambda$ = 0.3 μm. By solving the resultant Eq. (4) we find the grating fill factor F = 0.3509. With this fill factor and other parameters, the corresponding effective refractive indices from Eqs. (2) are found to be

$$n_{2\mathrm{TE}} = 2.309, \qquad n_{2\mathrm{TM}} = 1.2447.$$

Finally, we find from Eqs. (1) the required etch depth of the HSFG, d = 0.728 μm.
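The design steps above can be sketched numerically. The following is a minimal illustration, not the authors' code: it assumes the standard second-order EMT expressions for a binary grating, and it assumes that the phase of an etched pixel relative to an unetched GaAs pixel is $2\pi (n - n_{2}) d / \lambda$, so that $\Phi_{\mathrm{TE}} = \pi$ and $\Phi_{\mathrm{TM}} = 2\pi$ fix both F and d. The function names and the bisection bracket are hypothetical choices made for this example.

```python
import math


def emt_second_order(fill, period, wavelength, n, n0=1.0):
    """Return (n_2TE, n_2TM) from second-order effective-medium theory."""
    # Zero-order effective indices (squared).
    n0te_sq = fill * n**2 + (1.0 - fill) * n0**2
    n0tm_sq = (n**2 * n0**2) / (fill * n0**2 + (1.0 - fill) * n**2)
    # Common factor (Lambda / lambda) * pi * F * (1 - F).
    k = (period / wavelength) * math.pi * fill * (1.0 - fill)
    n2te = math.sqrt(n0te_sq + (k * (n**2 - n0**2)) ** 2 / 3.0)
    n2tm = math.sqrt(
        n0tm_sq
        + (k * (1.0 / n**2 - 1.0 / n0**2)) ** 2 * n0te_sq * n0tm_sq**3 / 3.0
    )
    return n2te, n2tm


def design_hsfg(period, wavelength, n, n0=1.0, lo=0.2, hi=0.5):
    """Find the fill factor and etch depth giving phase steps of pi (TE) and 2*pi (TM)."""

    def mismatch(fill):
        n2te, n2tm = emt_second_order(fill, period, wavelength, n, n0)
        # Phi_TM = 2 * Phi_TE  <=>  (n - n2tm) = 2 * (n - n2te)
        return (n - n2te) - 0.5 * (n - n2tm)

    # Plain bisection; the bracket [lo, hi] is assumed to contain one sign change.
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if mismatch(lo) * mismatch(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    fill = 0.5 * (lo + hi)
    n2te, n2tm = emt_second_order(fill, period, wavelength, n, n0)
    # Phi_TE = 2*pi*(n - n2te)*d/lambda = pi  =>  d = lambda / (2*(n - n2te))
    depth = wavelength / (2.0 * (n - n2te))
    return fill, depth, n2te, n2tm


if __name__ == "__main__":
    # Lengths in micrometres: period 0.3, wavelength 1.55, GaAs index 3.37.
    print(design_hsfg(period=0.3, wavelength=1.55, n=3.37))
```

With the parameters quoted in the text, this sketch should land close to the reported fill factor and etch depth; small differences in the last digits are expected from rounding.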


To ensure the accuracy of this design we also simulate the phase delay introduced by the HSFG, using a RCWA. In the RCWA a single period of a surface relief grating is divided into a large number of planar layers. The optical fields are formulated in terms of spatial harmonics by Fourier series expansions of the dielectric constant of each layer. Boundary conditions are matched and the energy conservation law is employed to solve the resultant coupled diffraction equations. In our simulation we calculate only the phase delay caused by the HSFG, to confirm the results that we obtained by using the EMT. The actual diffraction efficiency of a FBCGH is estimated later by scalar diffraction theory. The grating parameters are the same as those given above. The simulation indicates that the phase delay introduced by a 0.73-μm-thick HSFG is 2.154π for TE polarization and 1.190π for TM. A GaAs layer of the same thickness without HSFG introduces a 3.178π phase delay. Thus the designed grating will have a 1.024π phase difference between the HSFG pixel and an unetched pixel for TE polarization and 1.987π for TM polarization. This simulation shows the validity of our design. It also indicates that the EMT, if used carefully, can be used in designing FBCGH elements.
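As a rough cross-check of the comparison above, the EMT phase delays can be estimated by treating the etched region as a homogeneous layer with the quoted effective indices. This is only a sketch under that assumption (it ignores interface reflections, which RCWA includes); the constant and function names are hypothetical.

```python
def phase_in_pi(index, thickness_um, wavelength_um):
    """Phase delay of a homogeneous layer, expressed in units of pi."""
    return 2.0 * index * thickness_um / wavelength_um


# Values quoted in the text (lengths in micrometres).
WAVELENGTH = 1.55
THICKNESS = 0.73
N_GAAS = 3.37
N_2TE = 2.309
N_2TM = 1.2447


def emt_phase_differences():
    """Phase difference (in units of pi) between an unetched and an etched pixel."""
    reference = phase_in_pi(N_GAAS, THICKNESS, WAVELENGTH)
    diff_te = reference - phase_in_pi(N_2TE, THICKNESS, WAVELENGTH)
    diff_tm = reference - phase_in_pi(N_2TM, THICKNESS, WAVELENGTH)
    return diff_te, diff_tm


if __name__ == "__main__":
    print(emt_phase_differences())
```

The EMT estimates come out near π (TE) and 2π (TM), within a few percent of the RCWA figures of 1.024π and 1.987π reported above.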

Following this design, we fabricated a diffractive structure upon a (100)-cut GaAs substrate, using electron beam lithography and dry etching techniques. The total area of the element was 100 μm × 100 μm. The period of the binary phase diffractive grating, T, was 10 μm. The period of the HSFG, $\Lambda$, was 0.3 μm, and the fill factor of the HSFG, F, was 0.35. The fabricated element has an etch depth of 0.75 μm for the HSFG. Figure 2 shows a scanning electron micrograph of the fabricated element.


Fig. 2. Scanning electron micrograph of the fabricated FBCGH.


Fig. 3. Schematic of the experimental evaluation of the fabricated FBCGH.
Table 1. Measured diffraction efficiencies and polarization contrast ratios of the fabricated FBCGH. Values in parentheses are calculated with scalar diffraction theory for comparison.

Performance                    0th Order        1st Order        -1st Order
TE efficiency                  0.86% (0.0%)     41.4% (40.5%)    44.2% (40.5%)
TM efficiency                  75.5% (100%)     0.15% (0%)       0.44% (0%)
Polarization contrast ratio    88.2:1           275:1            99.2:1


We evaluated the fabricated element with a He-Ne laser (Melles Griot) operating at 1.523 μm, using the setup shown schematically in Fig. 3. The beam was focused onto the FBCGH by a low-power (6×) microscope objective (MO). A Ge detector was used to measure the far-field diffraction patterns. The polarization state of the beam incident upon the FBCGH was controlled with a polarization rotator (Pol. Rot.).

In the binary phase FBCGH reconstruction stage we anticipate observing +1 and −1 propagating diffraction orders. In our characterization experiments we observed only two spots on the IR phosphor viewing card under TE polarization and one spot under TM polarization, although higher orders do exist and can be detected with a photodetector. We can optimize the distance between the microscope objective and the FBCGH by minimizing the measured energy diffracted into the 0th diffraction order under TE-polarized illumination. The measured diffraction efficiencies, excluding reflection, and the polarization contrast ratios are summarized in Table 1. The diffraction efficiency in Table 1 was calculated as the ratio between the intensity measured at a given diffraction order and that of the total light transmitted through the GaAs substrate without a FBCGH. These measured results show that the FBCGH has good polarization selectivity (large polarization contrast ratios) and diffraction efficiencies close to the theoretical limit. Note that the form-birefringent structure also serves as an antireflection coating, which explains why the measured diffraction efficiencies are slightly higher than that predicted by scalar diffraction theory for a binary phase element (40.5%). The expected results calculated with scalar diffraction theory are also listed in the table for comparison. The slight asymmetry between the efficiencies of the ±1st diffraction orders is due to imperfect normal incidence.
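The 40.5% figure quoted above follows from scalar diffraction theory for a binary phase grating. A minimal sketch, assuming a 50% duty cycle for the 10-μm binary grating and ignoring reflection and absorption (the function name is hypothetical):

```python
import math


def binary_phase_grating_efficiency(order, phase_step):
    """Scalar-theory efficiency of a 50%-duty-cycle binary phase grating.

    order      -- diffraction order (0, +/-1, +/-2, ...)
    phase_step -- phase difference between the two levels, in radians
    """
    if order == 0:
        return math.cos(phase_step / 2.0) ** 2
    if order % 2 == 0:
        # Even nonzero orders vanish for a 50% duty cycle.
        return 0.0
    return (2.0 / (order * math.pi)) ** 2 * math.sin(phase_step / 2.0) ** 2


if __name__ == "__main__":
    # TE sees a phase step of pi, TM a phase step of 2*pi (the design targets).
    for label, phase in (("TE", math.pi), ("TM", 2.0 * math.pi)):
        print(label, [round(binary_phase_grating_efficiency(m, phase), 4)
                      for m in (-1, 0, 1)])
```

For a phase step of π the ±1st orders each carry $(2/\pi)^2 \approx 40.5\%$ and the 0th order vanishes; for 2π all the light stays in the 0th order, matching the parenthesized values in Table 1.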

In conclusion, we have designed, fabricated, and evaluated a polarization-selective computer-generated hologram that uses form-birefringent nanostructures upon GaAs substrates. The element was designed by use of effective-medium theory, and the design was verified by rigorous vector field theory. The design and the experimental evaluations were found to be in good agreement. The fabricated element shows a large polarization contrast ratio (as large as 275:1) and high diffraction efficiencies (>40% for the first diffraction orders). Such an element may be useful in fabrication of compact and efficient free-space transparent photonic switching fabrics as well as in packaging optoelectronic devices and systems.

The research conducted at the University of California, San Diego, is funded by the National Science Foundation, the U.S. Air Force Office of Scientific Research, and Rome Laboratories, and that at Caltech is funded by the National Science Foundation.
reflective for TE polarization at an ... with polarization extinction ratios higher than 220:1 at a wavelength of 1.523 μm over a 20° angular bandwidth.
1. INTRODUCTION

Numerous optical information processing and imaging systems employ different polarization states to increase the information bandwidth and to reduce the cross talk between different channels. Some application examples include free-space optical switching networks, read-write magneto-optical data storage systems, and polarization-based imaging systems. In these systems, a polarizing beam splitter (PBS) is an essential element for separating two orthogonally polarized light beams. Most of the applications require that the PBS provide high extinction ratios, tolerate a wide angular bandwidth for high-resolution imaging, cover a broad wavelength range for operation with broadband sources, and have a compact size for effective packaging. Conventional PBS's based on either natural crystal birefringence (e.g., Wollaston prisms) or polarization selectivity of multilayer structures (e.g., PBS cubes) do not meet these requirements. The Wollaston prism requires a large thickness to generate enough walk-off distance between the two orthogonal polarizations owing to the intrinsically small birefringence of naturally anisotropic materials. An alternative design of a Wollaston-type prism takes advantage of form-birefringent materials that possess birefringence several times larger than that of natural birefringent materials, reducing the thickness considerably. However, the fabrication of such a structure is a tedious and complicated process, since thousands of thin-film layers need to be fabricated. PBS cubes are easier to fabricate, but they provide good extinction ratios only in a narrow angular bandwidth for a limited spectral range. Other designs that utilize form-birefringent high-spatial-frequency surface-relief gratings and a single-layer-coated dielectric slab have been proposed to reduce the size of the components, to solve the material compatibility problem, and to simplify the fabrication process. However, they usually suffer from low efficiency, low extinction ratio, small angular bandwidth, and operation in a limited wavelength range.

Previously we introduced a new PBS device that uses the anisotropic spectral reflectivity (ASR) characteristics of a high-spatial-frequency multilayer binary grating. The ASR mechanism is based on combining the form birefringence of a high-spatial-frequency grating (i.e., one whose grating period is much smaller than the wavelength of the incident field) with the resonant reflectivity of a multilayer structure. With our approach, the angular field and the wavelength range are considerably larger than those of conventional PBS devices. Many packaging and material compatibility problems have also been resolved with this new design. Some interesting characteristics of the element with ASR characteristics cannot be found in a conventional PBS component. For instance, when our ASR device is designed to operate with normally incident light, it acts as a highly efficient polarization-selective mirror.

In this paper we report the detailed design, fabrication, and experimental characterization of such a PBS based on ASR properties. In Section 2 we describe the principle of the ASR effect of a high-spatial-frequency multilayer binary grating. We employ the effective-medium theory (EMT) to explain intuitively why high-spatial-frequency multilayer binary gratings possess such characteristics. In Section 3 we introduce the design methodology of the PBS that employs such ASR properties. Here rigorous coupled-wave analysis (RCWA) tools are used to optimize the design of the PBS. We also investigate the angular and the wavelength dependence of the ASR PBS. In Section 4 the effects of various fabrication errors on the performance of the PBS are studied. In Section 5 we discuss the fabrication techniques employed to make our PBS, and we present experimental characterization results. We evaluate our PBS design in terms of polarization extinction ratio and efficiency for operation with waves of wide angular bandwidth. We also compare the experimental results with the numerical predictions. The summary and the directions for future research are provided in Section 6.

2.  PRINCIPLES  OF  ANISOTROPIC 
SPECTRAL  REFLECTIVITY 

To review the properties of wave propagation in stratified media, consider a multilayer structure formed on a substrate by deposition of alternating layers of isotropic dielectric materials with high and low indices of refraction $n_h$ and $n_l$, respectively. Such a structure exhibits high reflectivity in a wide spectral bandwidth. When the thickness of each layer corresponds to a quarter-wave optical thickness for a selected wavelength, a high-reflectivity spectral band will be roughly centered at that wavelength. For example, Fig. 1(a) shows a 15-layer quarter-wave structure made of Si and SiO2 with refractive indices of 3.48 and 1.44, respectively, for an incident wavelength of 1.523 μm. A high-reflectivity spectral band is clearly shown in Fig. 1(a) near that wavelength. One can increase the reflectivity of such a quarter-wave layered structure by increasing the value of the ratio $n_h/n_l$ and the total number of layers. Furthermore, larger values of the ratio $n_h/n_l$ also increase the spectral bandwidth of high reflectivity.
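The behavior of such a quarter-wave stack can be reproduced with the standard characteristic-matrix (transfer-matrix) method at normal incidence. The sketch below is illustrative only: it assumes lossless, nondispersive indices, incidence from air, a SiO2 substrate (n = 1.44), and Si as the outermost layers; none of these details are stated for Fig. 1(a) in the text, and the function names are hypothetical.

```python
import math


def stack_reflectance(wavelength, indices, thicknesses, n_inc=1.0, n_sub=1.44):
    """Normal-incidence reflectance of a lossless multilayer stack.

    indices and thicknesses list the layers starting from the incident side;
    wavelength and thicknesses share the same length unit.
    """
    # Accumulate the product of the layer characteristic matrices.
    m11, m12, m21, m22 = 1.0 + 0j, 0j, 0j, 1.0 + 0j
    for n, d in zip(indices, thicknesses):
        delta = 2.0 * math.pi * n * d / wavelength  # phase thickness of the layer
        c, s = math.cos(delta), math.sin(delta)
        a11, a12, a21, a22 = c, 1j * s / n, 1j * n * s, c
        m11, m12, m21, m22 = (
            m11 * a11 + m12 * a21,
            m11 * a12 + m12 * a22,
            m21 * a11 + m22 * a21,
            m21 * a12 + m22 * a22,
        )
    b = m11 + m12 * n_sub
    c_field = m21 + m22 * n_sub
    r = (n_inc * b - c_field) / (n_inc * b + c_field)
    return abs(r) ** 2


def quarter_wave_stack(n_high, n_low, num_layers, center_wavelength):
    """Alternating high/low layers, high index outermost, each a quarter wave thick."""
    indices = [n_high if i % 2 == 0 else n_low for i in range(num_layers)]
    thicknesses = [center_wavelength / (4.0 * n) for n in indices]
    return indices, thicknesses


if __name__ == "__main__":
    idx, thk = quarter_wave_stack(3.48, 1.44, 15, 1.523)
    for wl in (1.0, 1.3, 1.523, 1.8, 2.2):
        print(wl, stack_reflectance(wl, idx, thk))
```

Reducing `num_layers` or the index ratio in this sketch lowers the peak reflectance, in line with the trend described above.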

For an optical field at a normal angle of incidence, a multilayer structure that is made of isotropic dielectric materials presents identical reflectivity spectra for any two orthogonal linear polarizations. This occurs because of the symmetry of the structure for a normally incident wave. Therefore the structure cannot be used to separate normally incident fields by polarization. By substituting one of the isotropic materials that are used to form the multilayer structure with a birefringent material, we can create a new multilayer structure that will possess reflectivity spectral bands centered at different wavelengths for the two orthogonal polarizations at normal incidence. It is illustrated in Fig. 1(b) that once one of the


Fig. 1. (a) Schematic diagram of a 15-layer quarter-wave structure constructed of two isotropic materials (Si and SiO2); a plot of its spectral reflectivity versus wavelength is also shown. The thicknesses of the Si and SiO2 layers are set as the quarter-wave optical thickness at the wavelength of 1.523 μm. (b) Same as (a), with Si replaced by an anisotropic material, LiNbO3. The thickness of the LiNbO3 layer is set as the quarter-wave optical thickness corresponding to the refractive index for the ordinary wave.
isotropic materials (say, Si) is replaced by an anisotropic material the spectral reflectivity bands are separated [in this example we disregard the material incompatibility problems and choose LiNbO3 with refractive indices of 2.21 and 2.14 (Ref. 20) for ordinary and extraordinary waves, respectively]. We call these phenomena ASR. Unfortunately, multilayer structures consisting of natural anisotropic materials cannot easily be fabricated. Furthermore, since natural anisotropic materials possess small birefringence, this separation will be small. With our approach based on a high-spatial-frequency multilayer binary grating, the separation between reflectivity spectral bands can increase considerably owing to the exceptionally large anisotropy that can be obtained with a form-birefringent nanostructure.

Form-birefringence effects appear in high-spatial-frequency gratings constructed of isotropic dielectric materials. Because of the geometric asymmetry of the grating structure, the two orthogonally polarized optical fields, one parallel to the grating grooves (called the TE field) and the other perpendicular to the grating grooves (called the TM field), encounter different boundary conditions, resulting in distinct effective indices of refraction. On propagation through the grating structure, the TE and the TM fields will acquire a relative phase difference similar to that obtained in natural anisotropic materials. This similarity takes place because the subwavelength structure can be designed such that only the zeroth diffraction order will propagate, while all the higher diffraction orders will become evanescent. Such a high-spatial-frequency grating at the boundary between two isotropic materials can be seen as an equivalent thin film of anisotropic material.

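For normally incident TE- and TM-polarized optical waves, the effective indices of refraction of a surface-relief high-spatial-frequency binary grating can be estimated from the second-order EMT:

$$
n_{\mathrm{TE}}^{(2)} = \left[ \left(n_{\mathrm{TE}}^{(0)}\right)^{2} + \frac{1}{3}\left(\frac{\Lambda}{\lambda}\right)^{2} \pi^{2} F^{2} (1-F)^{2} \left(n_{\mathrm{III}}^{2} - n_{\mathrm{I}}^{2}\right)^{2} \right]^{1/2}, \qquad (1)
$$

$$
n_{\mathrm{TM}}^{(2)} = \left[ \left(n_{\mathrm{TM}}^{(0)}\right)^{2} + \frac{1}{3}\left(\frac{\Lambda}{\lambda}\right)^{2} \pi^{2} F^{2} (1-F)^{2} \left(\frac{1}{n_{\mathrm{III}}^{2}} - \frac{1}{n_{\mathrm{I}}^{2}}\right)^{2} \left(n_{\mathrm{TM}}^{(0)}\right)^{6} \left(n_{\mathrm{TE}}^{(0)}\right)^{2} \right]^{1/2}, \qquad (2)
$$

where F is the duty cycle of the grating defined by $F = w/\Lambda$, with $w$ being the width of the binary grating [see Fig. 2(a)]; $\Lambda$ is the grating period; $\lambda$ is the wavelength of the incident wave; $n_{\mathrm{I}}$ and $n_{\mathrm{III}}$ are the indices of air and the grating material, respectively; and $n_{\mathrm{TE}}^{(0)} = [F n_{\mathrm{III}}^{2} + (1-F) n_{\mathrm{I}}^{2}]^{1/2}$ and $n_{\mathrm{TM}}^{(0)} = \{ n_{\mathrm{III}}^{2} n_{\mathrm{I}}^{2} / [F n_{\mathrm{I}}^{2} + (1-F) n_{\mathrm{III}}^{2}] \}^{1/2}$ are the effective indices of refraction for TE and TM waves, respectively, provided by the zero-order EMT. When a high-spatial-frequency grating is formed in a multilayer structure made of two isotropic materials, the composition becomes an artificial anisotropic multilayer structure that will possess ASR characteristics. One can fabricate such an element by etching a high-spatial-frequency binary grating directly into a multilayer mirror structure.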
Figure 2(a) shows an example of a high-spatial-frequency multilayer binary grating. The two isotropic materials used for constructing the multilayer structures are SiO2 and Si, with refractive indices of 1.44 and 3.48, respectively, for an operating wavelength of 1.523 μm. The SiO2 and Si materials are chosen because of their fabrication compatibility and because of low absorption coefficients in the near-infrared region, which ensure a low insertion loss of the device. For operation as a form-birefringent zeroth-diffraction-order grating, we set the grating period to be equal to 0.6 μm, with the duty cycle of 0.5. From second-order EMT [Eqs. (1) and (2)], we


Fig. 2. (a) Schematic diagram of a seven-layer ASR PBS designed for light at normal incidence. The center operating wavelength is 1.523 μm. The design parameters are indicated in the figure. (b) EMT and (c) RCWA results of the reflectivity for TE- and TM-polarized waves versus wavelength.
obtain the following effective refractive indices for the two materials: $n_{\mathrm{TE,Si}} = 3.24$, $n_{\mathrm{TE,SiO_2}} = 1.25$ and $n_{\mathrm{TM,Si}} = 1.72$, $n_{\mathrm{TM,SiO_2}} = 1.18$. The effective indices of both materials used in the multilayer structure are larger for TE polarization than for TM polarization. This indicates that in the spectral domain the reflectivity band for TE polarization will be centered at a longer wavelength as compared with that for TM polarization, i.e., we will clearly observe a large ASR effect. The separation between the two reflectivity spectral bands has been dramatically increased over that possible with natural anisotropic materials.
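As a check on the quoted values, the second-order EMT expressions can be evaluated by hand for the Si layers. The worked numbers below assume a duty cycle F = 0.5 (the initial design value given in Section 3), a period of 0.6 μm, a wavelength of 1.523 μm, air as the groove material, and the standard second-order EMT form; intermediate values are rounded.

```latex
\begin{align*}
\left(\frac{\Lambda}{\lambda}\right)^{2}\pi^{2}F^{2}(1-F)^{2}
  &= \left(\frac{0.6}{1.523}\right)^{2}\pi^{2}(0.25)^{2} \approx 0.0957,\\
\left(n_{\mathrm{TE}}^{(0)}\right)^{2} &= 0.5\,(3.48)^{2} + 0.5 \approx 6.555,\\
\left(n_{\mathrm{TM}}^{(0)}\right)^{2} &= \frac{(3.48)^{2}}{0.5 + 0.5\,(3.48)^{2}} \approx 1.847,\\
n_{\mathrm{TE}}^{(2)} &\approx \left[6.555 + \tfrac{1}{3}(0.0957)(12.110 - 1)^{2}\right]^{1/2}
  \approx (6.555 + 3.939)^{1/2} \approx 3.24,\\
n_{\mathrm{TM}}^{(2)} &\approx \left[1.847 + \tfrac{1}{3}(0.0957)\left(\tfrac{1}{12.110} - 1\right)^{2}(1.847)^{3}(6.555)\right]^{1/2}
  \approx (1.847 + 1.110)^{1/2} \approx 1.72.
\end{align*}
```

Repeating the same steps with an index of 1.44 gives approximately 1.25 (TE) and 1.18 (TM) for the SiO2 layers. Note that the second-order correction is far from negligible here, which is consistent with the later remark that a rigorous method is needed for the final design.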

The ASR characteristic of a multilayer form-birefringent binary grating is the essential property needed to realize the ASR PBS. By using the theory of optical wave propagation in a stratified medium that is characterized by second-order EMT, we show that the ASR property can be used to create a compact PBS. Figure 2(b) shows the spectral reflectivities of the seven-layer grating illustrated in Fig. 2(a). It can clearly be seen that near a wavelength of 1.523 μm the grating is transparent for TM polarization and reflective for TE polarization. Therefore the two orthogonal polarization components of the light wave can be separated to propagate in opposite directions with efficiencies close to 100%.


3.  DESIGN  AND  MODELING  OF  THE 
POLARIZING  BEAM  SPLITTER 

In Section 2 we used second-order EMT for an intuitive explanation of the ASR characteristic of high-spatial-frequency gratings. However, previous studies have shown that, to accurately model devices in which the grating period is comparable with the wavelength, a rigorous method must be employed. In our case the grating period is approximately one third of a wavelength. Our modeling method is to use EMT results as an initial estimate of device parameters and then use a rigorous method to modify the design for optimal performance. The modeling tool that we used is RCWA, which we have experimentally verified for its accuracy in modeling of surface-relief-type gratings.

The design procedure is described by the following two steps: First, we use second-order EMT to calculate the effective indices of TE- and TM-polarized light in each layer of the grating at the desired center operating wavelength. The grating period is set to be small enough that only the zeroth diffraction order can propagate. From our previous study we know that a high-spatial-frequency grating can possess the largest effective-index difference for waves with orthogonal polarizations when the duty cycle is near 0.5. Therefore we set the duty cycle F = 0.5 as an initial value. An interesting characteristic is that the value of the effective-index ratio for TE-polarized light is larger than that for TM-polarized light. For example, the effective-index ratios of the grating illustrated in Fig. 2(a), $(n_h/n_l)_{\mathrm{TE}}$ and $(n_h/n_l)_{\mathrm{TM}}$, are 2.59 and 1.46, respectively. This indicates that, to achieve the same reflectivity, the number of layers required for TE polarization will be less than that required for TM polarization. To minimize the number of layers needed to achieve a desired performance, we choose to maximize reflectivity for TE-polarized light.
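The layer-count argument can be illustrated with the closed-form peak reflectance of an ideal quarter-wave stack. This is a sketch under simplifying assumptions that go beyond the text: each polarization is treated as if the layers were exactly a quarter wave thick for it, the stack has an odd number of layers with the high-index material outermost, light is incident from air, and the substrate is SiO2 (n = 1.44). The function names and the 95% target are hypothetical.

```python
def quarter_wave_reflectance(n_high, n_low, num_layers, n_inc=1.0, n_sub=1.44):
    """Peak reflectance of an ideal quarter-wave stack (odd layer count, high index outermost)."""
    if num_layers < 1 or num_layers % 2 == 0:
        raise ValueError("num_layers must be a positive odd integer")
    pairs = (num_layers - 1) // 2
    # Equivalent optical admittance of the stack-plus-substrate at the design wavelength.
    admittance = n_high**2 * (n_high / n_low) ** (2 * pairs) / n_sub
    return ((n_inc - admittance) / (n_inc + admittance)) ** 2


def min_layers(n_high, n_low, target, max_layers=99):
    """Smallest odd number of layers whose peak reflectance reaches the target."""
    for num_layers in range(1, max_layers + 1, 2):
        if quarter_wave_reflectance(n_high, n_low, num_layers) >= target:
            return num_layers
    return None


if __name__ == "__main__":
    # Effective indices quoted in Section 2 for the Si / SiO2 grating layers.
    print("TE:", min_layers(3.24, 1.25, 0.95))
    print("TM:", min_layers(1.72, 1.18, 0.95))
```

Under these assumptions the TE polarization reaches a given reflectance with roughly half as many layers as the TM polarization, which is the reason for maximizing the TE reflectivity.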

In the next step we allow each layer to have a quarter-wave optical thickness based on the effective index for TE-polarized light. These parameters are now used as the basis for an optimum design by RCWA. The thickness and the duty cycle of a high-spatial-frequency grating have been shown to be the most important parameters that will affect the phase difference between the two orthogonally polarized waves. Optimization is performed by incremental variation of the thickness of the layers of the grating to obtain the highest extinction ratio at the operational wavelength. To achieve broad reflectivity spectral bands, we use high-refractive-index materials for both the first and the last layers in the structure.

Figure 2(c) shows the RCWA results of TE and TM reflectivities as a function of the wavelength for the seven-layer ASR PBS shown in Fig. 2(a). The PBS is designed for normally incident optical fields at a center operating wavelength of 1.523 μm. As expected, the reflectivity of TE-polarized light is higher than that of TM-polarized light. The TE reflectivity spectral band is broader and is centered at a longer wavelength than is that of the TM wave. This ASR property cannot be accomplished with an isotropic multilayer structure for a normally incident optical field. Since this PBS is made of nonabsorbing materials (at those operating wavelengths), there is basically no insertion loss, and the efficiency is nearly 100%. Looking at Fig. 2(c), we can see that both the TE reflection efficiency and the TM transmission efficiency are higher than 99% at the operating wavelength of 1.523 μm. Also, the transmission polarization extinction ratios (defined as the ratio of transmittance of TM-polarized light to that of TE-polarized light) remain high over a broad spectral range (>200:1 over a 200-nm range). However, the reflection polarization extinction ratios (defined as the ratio of reflectance of TE-polarized light to that of TM-polarized light) are extremely high over a small spectral range (>1000:1 over a 20-nm range). These unique features can be employed in constructing either broadband low-insertion-loss normal-incidence polarizers or highly efficient polarization-selective mirrors for microlaser cavities. Other ASR PBS designs following the design procedure mentioned above for a center operating wavelength of 1.3 μm can be found in our previous paper.

To separate the path of the reflected wave from that of the incident wave, we also investigate an off-axis geometry [Fig. 3(a)]. Figure 3(b) shows the numerical results of the reflectivity versus the wavelength of the slanted incidence optical wave from a five-layer grating. For an incident wavelength of 1.523 μm, both the TE reflection efficiency and the TM transmission efficiency are higher than 99%, and the polarization extinction ratios for reflection and transmission are better than 800:1 and 300:1, respectively. This slanted incidence arrangement offers two advantages: (1) Reflectivity from each layer for TE polarization is increased; thus only five layers were needed to achieve performance similar to that of the seven-layer design for normally incident light; and (2) the sidelobe for the TM reflectivity is flattened, allowing operation of the beam splitter in an even broader spectral range. One can further improve the efficiencies and the extinction ratios of the PBS by fine tuning the grating period and the duty cycle or by adding more layers to the structure.

Various applications require that the optical components in the system maintain good performance within a large two-dimensional (2D) angular bandwidth and a broad spectral range of light waves. Conventional PBS's provide good extinction ratios, usually in a narrow angular bandwidth for a limited wavelength range. Most of the polarization-sensitive diffractive optical elements also lack these abilities. Our PBS, however, is shown to provide good performance for optical signals that have a wide 2D angular bandwidth, as well as a broad spectral range. As illustrated in Fig. 3(a), the angles of incidence are varied to span an angular bandwidth of ±10° in both θ and φ directions defined near the initial 42° bias angle. The results shown in Fig. 3(c) indicate that, at a wavelength of 1.523 μm, the TE reflection efficiencies are higher than 99.2% and 98% in the 5° and 10° angular bandwidth cones, respectively, and the TM transmission efficiencies are higher than 99.6% in both these angular ranges. The polarization extinction ratios [see Fig. 3(d)] are better than 400:1 and 200:1 for reflection in the 5° and the 10° angular bandwidth cones, respectively. For the transmission, the polarization extinction ratios are smaller but still better than 130:1 and 50:1 in the 5° and the 10° angular bandwidth cones, respectively. These results show that a wide 2D angular bandwidth, as well as a broad spectral range, of operation is possible with this design. Furthermore, the property of high and uniform efficiency makes the ASR PBS suitable for many imaging systems applications.

4. STUDIES OF FABRICATION
TOLERANCES OF THE POLARIZING BEAM
SPLITTER

In this section we investigate the effect of fabrication errors (e.g., etch depth error, grating profile error, and duty cycle error) on the performance of the ASR PBS. We first investigate the effect of etch depth fabrication error, distinguishing between the two types of errors: an underetched grating [see Fig. 4(a), i.e., part of the last thin Si layer remains on the SiO2 substrate], and an overetched grating [see Fig. 4(b), i.e., the substrate is slightly etched]. The results of our simulation show that the reflectivity for TM polarization is more sensitive to the underetching error than is that for TE polarization [see Fig. 4(a)]. The remaining thin layer of Si will increase the TM reflectivity, since Si is a dense medium, resulting in a large reduction in the TM transmission efficiency. This also decreases the reflection polarization extinction ratio.


J. Opt. Soc. Am. A/Vol. 14, No. 7/July 1997


Tyan et al.




Fig.  4.  Effects  of  (a)  underetching  and  (b)  overetching  fabrication  error  on  reflectivity  and  extinction  ratios  for  TE-  and  TM-polarized 
light. 


In contrast, Fig. 4(b) shows that overetching error has almost no effect on the efficiency of either TE- or TM-polarized waves. We conclude that an accurate etch depth control to avoid underetching is necessary to attain good performance from these devices.

Another possible fabrication error is the duty ratio error of the high-spatial-frequency grating. We use the same design parameters as for the PBS shown in Fig. 3(a), except that the duty cycle F is varied continuously from 0.3 to 0.6 [see Fig. 5(a)]. The numerical results indicate that the reflectances for both polarizations stay approximately the same for the different duty cycles ranging from 0.3 to 0.55 [see Fig. 5(b)]. The abrupt decrease of the TE reflectance at the duty cycle of 0.57 may be due to the energy coupling into the guided waves propagating along the multilayer structure. We can observe from Fig. 5(c) that, when the duty cycle of the grating approaches 0.54, the transmission polarization extinction ratios increase briefly. Since the polarization extinction ratio is a ratio of a large value to an extremely small value, a minor variation of the small value will cause an abrupt change of the ratio. These resonance phenomena can also be seen in the other figures (see, e.g., Fig. 4).
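The sensitivity of the extinction ratio to its denominator is easy to see numerically. The transmittance values in this sketch are hypothetical, chosen only to illustrate the point; they are not taken from Fig. 5.

```python
import math


def extinction_ratio(strong, weak):
    """Ratio of the passed polarization's efficiency to the blocked one's."""
    return strong / weak


def to_decibels(ratio):
    """Express a power ratio in decibels."""
    return 10.0 * math.log10(ratio)


if __name__ == "__main__":
    t_tm = 0.996  # hypothetical TM transmittance, essentially constant
    for t_te in (0.005, 0.003, 0.001):  # hypothetical residual TE transmittance
        ratio = extinction_ratio(t_tm, t_te)
        print(f"T_TE = {t_te:.3f}: {ratio:7.1f}:1  ({to_decibels(ratio):.1f} dB)")
```

A change of only 0.4 percentage points in the residual TE transmittance moves the ratio from about 200:1 to about 1000:1, while the TM efficiency itself is unchanged.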

The last fabrication error that we investigated is the shape of the grating profile. Again, we used the same basic design shape as that of the PBS shown in Fig. 3(a). We vary the shape of the grating profile by holding the width W of the bottom of the grating constant as the top is varied as W − ΔW to form a symmetric trapezoidal profile, as illustrated in Fig. 6(a). The results shown in Fig. 6(b) indicate that the efficiencies of the PBS will remain nearly 100% over a fairly large grating profile error. The polarization extinction ratios [see Fig. 6(c)] slowly decrease as the shape changes from rectangular to trapezoidal.

In general, the ASR PBS are sensitive to underetching
fabrication errors. However, these elements are relatively
immune to the effects of fabrication errors in duty cycle,
grating profile, and overetching. This indicates a rather large
fabrication error tolerance of the ASR PBS. In Section 5 we
discuss fabrication and experimental verification of such ASR
PBS.

5.  FABRICATION  AND  EXPERIMENTAL 
CHARACTERIZATION  OF  THE 
POLARIZING  BEAM  SPLITTER 

Fabrication of the ASR PBS in the visible spectral range
is a challenging task because of the requirement to fabricate
a grating with a subwavelength grating period. However, for
the near-infrared range of operation, fabrication of the
structure is practically possible. For example, in our design
the total grating depth of five layers is 0.91 μm, with a
grating period of 0.6 μm and a duty cycle of 0.5, resulting in
a grating aspect ratio of approximately 3:1, which is within
the fabrication capabilities of modern microfabrication
technology.
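The depth and aspect-ratio figures can be checked with a short calculation. The sketch below uses the layer thicknesses quoted in the fabrication description (0.13 μm Si and 0.26 μm SiO2) and assumes the five layers split into three Si and two SiO2 layers; that split is an inference from the 0.91 μm total, not something the text states.

```python
SI_THICKNESS_NM = 130.0    # Si layer thickness from the text
SIO2_THICKNESS_NM = 260.0  # SiO2 layer thickness from the text
N_SI, N_SIO2 = 3, 2        # assumed split of the five layers
PERIOD_NM = 600.0          # grating period from the text
DUTY_CYCLE = 0.5           # grating duty cycle from the text


def stack_depth_nm(n_si: int = N_SI, n_sio2: int = N_SIO2) -> float:
    """Total depth of the etched multilayer stack."""
    return n_si * SI_THICKNESS_NM + n_sio2 * SIO2_THICKNESS_NM


def aspect_ratio(depth_nm: float, period_nm: float = PERIOD_NM,
                 duty: float = DUTY_CYCLE) -> float:
    """Etch depth divided by the width of one grating ridge."""
    return depth_nm / (period_nm * duty)


depth = stack_depth_nm()
print(f"depth = {depth:.0f} nm, aspect ratio = {aspect_ratio(depth):.2f}:1")
```

Under these assumptions the depth comes out at 910 nm and the ridge width at 300 nm, consistent with the quoted aspect ratio of approximately 3:1.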

Fig. 5. (a) Diagram defining duty cycle error. Effect of duty
[...] TM-polarized light.

The fabrication procedures of the ASR PBS are shown
schematically in Fig. 7. First, five layers of Si and SiO2,
with thicknesses of 0.13 and 0.26 μm, respectively, were
sputtered alternately onto a SiO2 substrate to form a
one-dimensional (1D) dielectric mirror base. The thickness of
each layer was controlled with an accuracy of better than
5 nm. 30-kV high-voltage e-beam lithography was then used to
define a high-resolution grating with a period of 0.6 μm and a
duty cycle of 0.5 over a square area of 50 μm × 50 μm on a
0.4-μm-thick PMMA layer that was thicker than most
conventional e-beam lithography resists [see Fig. 7(a)]. The
PMMA pattern was developed for 60 s in a 3:7 mixture of
cellosolve and methanol, resulting in the structure shown in
Fig. 7(b). Afterward, 0.1 μm of chrome was deposited [see
Fig. 7(c)], and the PMMA pattern was lifted off in acetone to
generate the mask [see Fig. 7(d)]. With the durable chrome
mask, the periodic pattern was transferred through the SiO2
and the Si layers by reactive ion etching with C2F6 and
NF3/CCl2F2, respectively [see Fig. 7(e)]. The etch rates of
SiO2 and Si were 25 and 100 nm/min, respectively. Finally, the
remaining chrome mask was removed by wet etching [see
Fig. 7(f)]. The fabricated PBS was inspected under a scanning
electron microscope (SEM). The SEM side view of the structure
is shown in Fig. 8.
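As a rough plausibility check on the pattern-transfer step, the quoted etch rates give an order-of-magnitude etch time for the stack. The sketch below assumes a three-Si, two-SiO2 layer sequence (inferred from the 0.91 μm total depth, not stated in the text) and ignores gas changeover, mask erosion, and rate drift.

```python
# Etch rates quoted in the text, in nm/min.
ETCH_RATE_NM_PER_MIN = {"SiO2": 25.0, "Si": 100.0}

# Assumed layer sequence (material, thickness in nm); the ordering and the
# 3 Si + 2 SiO2 split are illustrative assumptions.
ASSUMED_STACK = [
    ("Si", 130.0), ("SiO2", 260.0),
    ("Si", 130.0), ("SiO2", 260.0),
    ("Si", 130.0),
]


def etch_time_min(stack, rates=ETCH_RATE_NM_PER_MIN) -> float:
    """Sum of thickness / rate over all layers, in minutes."""
    return sum(thickness / rates[material] for material, thickness in stack)


print(f"estimated total etch time: {etch_time_min(ASSUMED_STACK):.1f} min")
```

Most of the estimated time is spent in the SiO2 layers because of their lower etch rate and larger thickness.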

Fig. 7. Schematic diagram describing the fabrication procedures
of ASR PBS on SiO2 substrates. PMMA, poly(methyl methacrylate).

Fig. 8. SEM photograph of the fabricated ASR PBS. The PBS is
fabricated on a SiO2 substrate and consists of a multilayer
structure of Si and SiO2 with thicknesses of 0.13 and 0.26 μm,
respectively. The grating has a period of 0.6 μm, with a duty
cycle of 0.5.

The fabricated PBS element was evaluated experimentally
by means of the measurement setup shown schematically in
Fig. 9. We used a polarized He-Ne laser source (Melles Griot)
operating at a near-infrared wavelength of 1.523 μm with a
0.8-mW maximum output power. The laser beam is focused onto the
50 μm × 50 μm aperture of the fabricated element with a
low-power (5×) microscope objective. The input polarization is
controlled by a polarization rotator. Two Ge photodetectors
(Newport Model 818-IR) were used to measure the transmittance
and the reflectance simultaneously. For alignment purposes we
also used another He-Ne laser operating at a visible wavelength
of 0.6328 μm. The axis of the visible He-Ne laser was aligned
to coincide with the optical axis of the infrared laser.

The measured efficiencies (i.e., the TM transmittance
and the TE reflectance) and the measured polarization
extinction ratios versus incidence angles varying from 32° to
52° are shown in Figs. 10(b) and 10(c), respectively. The
incidence angle is the angle between the incident beam and the
normal to the grating surface. The incident beam lies in the
plane perpendicular to the grating grooves and parallel to the
grating vector. In such an arrangement, the measured 1D
efficiency and polarization extinction ratio curves correspond
to those determined numerically from the vertical cross
sections of the 2D contour plots shown in Fig. 3. The
reflectance is measured in a range of 36-52° because of the
practical constraints in the components used in our
experimental setup. The experimental results show that the
fabricated ASR PBS retains high polarization extinction ratios
over a large angular bandwidth (±10°) from the designed
incidence angle of 42°. The measured transmission polarization
extinction ratios are higher than 220:1, with a maximum value
of 830:1. For the reflection, the polarization extinction
ratios are smaller, but they are still better than 40:1, with
a maximum value of 70:1. The fabricated ASR PBS also has very
high efficiencies. The measured reflection efficiency for the
TE-polarized light and transmission efficiency for the
TM-polarized light are higher than 99% and 97%, respectively.
The slightly lower measured efficiency for the TM-polarized
light may occur because of the etch depth errors, causing
reductions of reflection polarization extinction ratios. We
expect that more-accurate control of fabrication tolerances
(i.e., etch depth) will improve both the efficiencies and the
extinction ratios.
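For readers who prefer decibels, the measured extinction ratios quoted above convert as follows. This is a convenience sketch; the dictionary labels are ours, and the decibel figures are derived from the quoted ratios rather than taken from the source.

```python
import math


def ratio_to_db(power_ratio: float) -> float:
    """Convert a power extinction ratio (e.g. 220 for 220:1) to decibels."""
    return 10.0 * math.log10(power_ratio)


# Measured extinction ratios quoted in the text.
MEASURED = {
    "transmission, minimum": 220.0,
    "transmission, maximum": 830.0,
    "reflection, minimum": 40.0,
    "reflection, maximum": 70.0,
}

for label, ratio in MEASURED.items():
    print(f"{label:24s} {ratio:5.0f}:1  =  {ratio_to_db(ratio):.1f} dB")
```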

For comparison between the experimental results and
the numerical design predictions, we used the SEM image (see
Fig. 8) and the optical microscope observation of the
fabricated element to estimate the fabrication tolerances of
our sample. The sample observed under the optical microscope
shows that the SiO2 substrate areas have a light brown color,
indicating that the last Si layer was underetched. The SEM
photograph indicates a deviation of the grating profile from
an exact rectangular shape. For modeling purposes we estimate
that our sample is 6 nm underetched. Consequently, we use the
modified grating profile shown in Fig. 10(a). Figures 10(b)
and 10(c) show that the experimental results and the numerical
predictions are in good agreement. Note that the only
significant deviation between modeling and experimental
results occurs in Fig. 10(c) near the incidence angle of 46°.
This  deviation  may  occur  because  of  the  limited  dynamic 
range  of  our  photodetectors  (i.e.,  a  dynamic  range  of  more 
than  10®  will  be  needed  to  resolve  this  resonance). 


Fig. 9. Schematic diagram of the experimental setup for the
characterization of the fabricated ASR PBS. M, mirror; PR,
polarization rotator; BS, beam splitter; MO, microscope
objective; RS, rotation stage; PD1 and PD2, Ge photodetectors.
The transmittance and the reflectance are measured
simultaneously to ensure accurate comparison of the extinction
ratios and efficiencies.

Fig. 10. Comparison of experimental measurements and numerical
predictions of the fabricated ASR PBS. (a) The device structure
shown in Fig. 3(a) has been modified to account for the
fabricated grating profile error as well as the underetching
fabrication error. (b) Measured and calculated efficiencies for
TE- and TM-polarized waves. (c) Measured and calculated
polarization extinction ratios in transmission and reflection
versus incidence angle.

6. CONCLUSIONS

We described the design, fabrication, and experimental
evaluations of an ASR PBS. This novel element combines the
form-birefringence effect of a high-spatial-frequency grating
with the high reflectance of multilayer structures. We use EMT
for initial design and RCWA for optimization and numerical
characterization of the PBS. Numerical modeling has also shown
that these elements can tolerate broad-spectrum optical
radiation with good performance in a wide 2D angular bandwidth
range. We have demonstrated that our PBS can be designed to
adapt easily to different applications, such as highly
efficient polarization-selective mirrors for vertical-cavity
microlasers and broadband low-insertion-loss normal-incidence
polarizers. Other ASR PBS designs for specified spectrum ranges
can be attained by means of suitable combinations of materials
to construct multilayer structures. This design flexibility
shows how ASR PBS can address the needs of many different
optoelectronic packaging systems. We also discussed the
influence of different fabrication errors on the performance of
the PBS and observed a rather large fabrication error tolerance
of the PBS.

A sample of the ASR PBS designed for a wavelength of
1.523 μm has been fabricated and experimentally characterized.
The experimental results show that our PBS provides high
measured polarization extinction ratios (maximum value of
830:1) for the two orthogonally polarized output optical waves.
The device has also been shown to have the capabilities for
operating with optical signals of wide angular bandwidth (±10°
near the designated incidence angle of 42°) with a high
extinction ratio (>220:1) and high efficiencies (>97%).
Finally, the experimental results have been compared with the
numerical predictions and have been found to be in good
agreement. Our future research goals are directed toward
enhancing the fabrication precision and toward developing new
ASR PBS designs with suitable materials for other spectrum
ranges.

The ASR PBS combines such unique features as compactness,
compatibility with semiconductor materials, negligible
insertion losses, polarization selectivity for light at normal
incidence, high polarization extinction ratios, and operation
with waves of large angular bandwidth and from a broad spectral
range. These characteristics make the devices desirable for use
in image processing, in optical interconnections, and in many
other polarization optics applications.


A 490-nm-deep nanostructure with a period of 200 nm was fabricated in a GaAs substrate by use of
electron-beam lithography and dry-etching techniques. The form birefringence of this microstructure was
studied numerically with rigorous coupled-wave analysis and compared with experimental measurements at
a wavelength of 920 nm. The numerically predicted phase retardation of 163.3° was found to be in close
agreement with the experimentally measured result of 162.5°, thereby verifying the validity of our numerical
modeling. The fabricated microstructures show extremely large artificial anisotropy compared with that
available in naturally birefringent materials and are useful for numerous polarization optics applications.
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The  form-birefringence  or  artificial-birefringence 
effect occurs when the period of such microstructures
is much less than the wavelength of the incident optical
field and the far field of the transmitted radiation
will possess only zero-order diffraction. The two
prevalent approaches to characterize such artificial
dielectric properties of the microstructured boundary
use the effective medium theory and rigorous coupled-wave
analysis (RCWA). In this study we choose to use RCWA
because the simpler effective medium theory does not
provide accurate results when the microstructured grating
period approaches the wavelength of the radiation.
Form-birefringent nanostructures (FBN's) have several
unique properties that make them superior to naturally
birefringent materials: (i) A high value for the strength
of form birefringence, Δn/n, can be obtained by the
selection of substrate dielectric materials with a large
refractive-index difference (here Δn and n are the
difference and the average of the effective indices of
refraction, respectively, for the two orthogonal
polarizations); for example, a high-spatial-frequency
surface-relief grating of rectangular profile on a GaAs
substrate provides a Δn/n value of ~0.63, which is much
larger than those found for naturally birefringent
materials (e.g., for calcite the value of Δn/n is ~0.1).
(ii) The magnitude of form birefringence, Δn, can be
adjusted by variation of the duty ratio as well as of the
shape of the microstructures. (iii) FBN's can be used to
modify the reflection properties of the dielectric
boundaries. Such FBN's are useful for constructing
polarization-selective beam splitters and general-purpose
polarization-selective diffractive optical elements such
as birefringent computer-generated holograms (BCGH's).
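The quoted strength of form birefringence can be cross-checked with the zero-order effective medium expressions for a rectangular lamellar grating. The sketch below is an estimate only: it assumes a GaAs index of 3.57 (the value used later in the text for 920 nm), air-filled grooves, and a duty ratio of 0.5, none of which the text ties explicitly to the ~0.63 figure. It also checks the zero-order condition at normal incidence for the 200-nm period.

```python
import math


def emt_zero_order(n_ridge: float, n_groove: float = 1.0, duty: float = 0.5):
    """Zero-order effective indices of a lamellar subwavelength grating.

    Returns (n_parallel, n_perpendicular) for the electric field parallel
    and perpendicular to the grooves. `duty` is the ridge fill fraction.
    """
    eps_r, eps_g = n_ridge ** 2, n_groove ** 2
    n_par = math.sqrt(duty * eps_r + (1.0 - duty) * eps_g)
    n_perp = 1.0 / math.sqrt(duty / eps_r + (1.0 - duty) / eps_g)
    return n_par, n_perp


def birefringence_strength(n_par: float, n_perp: float) -> float:
    """Delta-n divided by the average of the two effective indices."""
    return (n_par - n_perp) / (0.5 * (n_par + n_perp))


def is_zero_order(period_nm: float, wavelength_nm: float, n_sub: float) -> bool:
    """True if only the zeroth order propagates in the substrate
    at normal incidence (period < wavelength / n_sub)."""
    return period_nm < wavelength_nm / n_sub


N_GAAS = 3.57  # assumed here; quoted later in the text for 920 nm
n_par, n_perp = emt_zero_order(N_GAAS)
print(f"n_par = {n_par:.3f}, n_perp = {n_perp:.3f}")
print(f"delta_n / n = {birefringence_strength(n_par, n_perp):.3f}")
print("zero-order grating:", is_zero_order(200.0, 920.0, N_GAAS))
```

Under these assumptions the estimate lands close to 0.63, and the larger index is the one for polarization parallel to the grooves. Because the 200-nm period is not far below the 258-nm cutoff, the zero-order expressions are only a first estimate, which is consistent with the authors' choice of RCWA.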

A BCGH is a general-purpose diffractive optical
element that has two independent though arbitrary
impulse responses for the two orthogonal linear
polarizations. BCGH elements are useful in various
applications. In its original design a BCGH consists of
two surface-relief substrates with at least one of them
birefringent. The two independent etch depths of the
BCGH element provide the two degrees of freedom
necessary to encode the two independent phase functions.
However, the BCGH fabrication process can be simplified
by use of a single FBN made of an isotropic substrate.
One can obtain the two degrees of freedom necessary for
construction of a BCGH by varying, for example, the duty
ratio and the etch depth of the dielectric
nanostructures. In this Letter we investigate the
fabrication and characterization of FBN's to determine
their usefulness for construction of a form-birefringent
computer-generated hologram.

Fabrication of FBN's for visible and near-infrared
wavelength regions is a challenging task. In the
past, artificial birefringence was observed
experimentally for microstructures with a relatively
large period that can be operated in the microwave or
far-infrared spectrum range. For visible and
near-infrared radiation-range applications the artificial
dielectrics were fabricated with a stratified multilayer
structure or by the recording of interference patterns
of two coherent light beams to create a subwavelength
grating in a photoresist. Neither method is suitable
for our BCGH applications since the former creates
the form birefringence in a direction perpendicular to
the substrate surface and the latter does not provide
design flexibility in terms of microstructure shape and
the values of the dielectric constants. To achieve the
design and the fabrication flexibility required by a
BCGH, we use electron-beam lithography to generate
the high-spatial-frequency patterns.

The fabrication procedures of FBN's in GaAs
substrates are shown schematically in Fig. 1. First,
a GaAs substrate was coated with a layer of SiO2, then
a layer of Au, and finally a layer of
high-molecular-weight poly(methyl methacrylate) (PMMA).
Electron-beam lithography with a 30-kV incident beam
energy was used to define the high-resolution linear
gratings over a square area of 100 μm × 100 μm on the
spun-on 70-nm-thick resist layer. The PMMA pattern was
developed for 14 s in a 3:7 mixture of C2H5OCH2CH2OH in
CH3OH and then was transferred onto the 70-nm-thick Au
layer by ion milling with 1500-V Ar ions. This Au layer
was used as a dry-etching mask to transfer the patterns
into the 100-nm-thick layer of sputter-deposited SiO2 by
reactive-ion etching. During this etching process,
60-mTorr C2F4 was used as the reactive gas, and a 300-V
bias voltage was applied (50 W of rf power) at an etch
rate of 20 nm/min. Then, a chemically assisted ion-beam
etching system was used to etch the high-resolution
nanostructure to the desired depth in the GaAs by using
an Ar-ion beam assisted with Cl2 reactive gas. Finally,
we removed the SiO2 mask by immersing the sample in
buffered HF. The 490-nm-deep nanostructure with a period
of 200 nm fabricated in the GaAs substrate was inspected
under a scanning-electron microscope (SEM). The top view
and the cross-sectional view of the fabricated
nanostructure are shown in Figs. 2(a) and 2(b),
respectively.

The experimental setup for characterization of
the form birefringence of fabricated nanostructures
is shown schematically in Fig. 3. An Ar⁺-pumped
Ti:sapphire laser was operated at a wavelength of
920 nm, where the GaAs substrate is transparent
with minimum absorption. The polarization of the
laser beam was controlled by a polarization rotator so
that the normally incident optical wave was polarized
linearly at 45° with respect to the grooves' direction.
We used a microscope objective to focus the incident
beam onto the 100 μm × 100 μm microstructure
pattern. At a distance of 1 m from the sample we
inserted a 1-cm-diameter aperture stop and a
polarization analyzer followed by a photodetector. The
aperture stop was introduced to avoid contributions of
the obliquely incident light and diffracted field from
the edges of the sample, thereby ensuring the validity
of the paraxial approximation necessary for our
polarization measurements. A Glan-Thompson-type
polarizer was used as an output analyzer.

For the experimental characterization of the FBN
we used Jones calculus. Let the Jones matrix of the
form-birefringent nanostructure on a GaAs substrate
be given by

$$J = \begin{bmatrix} a & 0 \\ 0 & b\exp(j\phi_s) \end{bmatrix}, \qquad (1)$$

where a and b are the amplitude transmittances of
the horizontally and vertically polarized light (i.e.,
perpendicular and parallel to the grooves' direction),
and $\phi_s$ is the phase difference between them on
propagation through the FBN. The output fields for these
two polarizations can be formulated as

represents the Jones matrix of an analyzer that is
aligned at an angle $\theta$ with respect to the vertical
direction, and the input field is 45° linearly polarized.
The intensity measured at the detector is given by
$$I_{\text{out}} = \tfrac{1}{2}\left(a^{2}\cos^{2}\theta + b^{2}\sin^{2}\theta + ab\,\sin 2\theta\,\cos\phi_s\right). \qquad (4)$$
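The Jones-calculus model described above can be sketched numerically. The code below is an illustration under stated assumptions, not the authors' fitting code: it assumes a unit-intensity input polarized at 45°, an ideal analyzer, and an analyzer angle measured from the axis associated with a (a convention chosen here so that a pairs with the cosine term). The function names are ours.

```python
import numpy as np


def fbn_jones(a: float, b: float, phi_s: float) -> np.ndarray:
    """Jones matrix of the nanostructure: diag(a, b * exp(j * phi_s))."""
    return np.array([[a, 0.0], [0.0, b * np.exp(1j * phi_s)]])


def analyzer_jones(theta: float) -> np.ndarray:
    """Ideal linear analyzer: projector onto the axis (cos theta, sin theta).

    Measuring theta from the first (a) axis is an assumed convention.
    """
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c * c, c * s], [c * s, s * s]])


def detected_intensity(theta: float, a: float, b: float, phi_s: float) -> float:
    """Intensity after the analyzer for a unit-intensity 45-degree input."""
    e_in = np.array([1.0, 1.0]) / np.sqrt(2.0)
    e_out = analyzer_jones(theta) @ fbn_jones(a, b, phi_s) @ e_in
    return float(np.vdot(e_out, e_out).real)


def closed_form_intensity(theta: float, a: float, b: float, phi_s: float) -> float:
    """Closed-form result of the same model."""
    return 0.5 * (a ** 2 * np.cos(theta) ** 2
                  + b ** 2 * np.sin(theta) ** 2
                  + a * b * np.sin(2.0 * theta) * np.cos(phi_s))


# Fitted parameters reported in the text for the sample with the FBN.
A_FIT, B_FIT, PHI_FIT = 0.67, 0.57, np.deg2rad(162.5)
for deg in (0, 45, 90, 135):
    t = np.deg2rad(deg)
    print(deg, round(detected_intensity(t, A_FIT, B_FIT, PHI_FIT), 4))
```

With a retardation close to 180° the cross term is negative at 45° and positive at 135°, so the transmitted intensity is small near 45° and large near 135° under this convention.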


Figure 4 shows a typical curve of measured intensity
versus the orientation angle of the analyzer, $\theta$, in
the setup of Fig. 3. The two curves correspond to the
measurements of the GaAs substrate with and without the
FBN. We curve fitted the measured data by using Eq. (4)
(see the solid and dashed curves in Fig. 4), which yielded
the resultant parameters a = 0.67, b = 0.57, and
$\phi_s$ = 162.5°. Note that $\phi_s$ is the positive phase
difference between the fields at vertical and horizontal
polarizations because the effective index for polarization
parallel to the grooves of the nanostructure is larger than
that for the perpendicular polarization. From Fig. 4 we can
also observe that the transmittance from the GaAs substrate
with the FBN is larger than that from the GaAs substrate
alone. This effect is obtained because the effective
indices of the nanostructure for both polarizations are
smaller than those of the GaAs substrate, and therefore the
nanostructure pattern acts as an antireflection coating.

Fig. 1. Schematic of the procedures for the fabrication of
form-birefringent nanostructures in GaAs substrates.

Fig. 2. SEM photographs of the fabricated form-birefringent
nanostructure in a GaAs substrate: (a) top and (b)
cross-sectional views.

Fig. 3. Schematic of the experimental setup for the
characterization of form-birefringent nanostructures. Pol.
Rot., polarization rotator; MO, microscope objective.

Fig. 4. Experimental measurements and curve-fitted results
of the transmitted intensity versus the orientation of
the analyzer for a GaAs substrate with and without the
form-birefringent high-spatial-frequency grating (HSFG).

The parameters a, b, and $\phi_s$ were also calculated
numerically with RCWA applied to the measured profile
[Fig. 2(b)] of the fabricated FBN in the GaAs substrate.
The profile is described by a trapezoid shape, with its
top edge being 5% of the period and the bottom edge
being 95% of the period. The period and the depth of
the GaAs nanostructure were estimated from the SEM
photographs of Fig. 2 to be 200 and 490 nm,
respectively. The GaAs substrate is 500 μm thick, with a
refractive index of 3.57 and an absorption coefficient
of  3.25  X  10  which  are  interpolated  from  the  data 
in Ref. 12. The numerical simulations provide parameters
a = 0.743, b = 0.714, and $\phi_s$ = 163.3°. The
computer-simulation result for the phase difference
between the two orthogonal polarizations is found to be
in very good agreement (0.5% difference) with the
measured results, confirming the validity of our
RCWA-based numerical model. We anticipate that the slight
difference in the amplitude transmission coefficients
occurs as a result of (i) some scattering loss on the
surface of the nanostructure, (ii) diffraction scattering
on the limiting aperture of the nanostructure, (iii)
inaccuracy in the calculated absorption coefficient for
the GaAs substrate, and (iv) inaccuracy in the assumed
profile and depth.

The characteristics of the fabricated nanostructure
shown in Fig. 4 indicate that it will be possible to
obtain a relative phase retardation of π (e.g., a
half-wave plate) between the vertical and horizontal
polarizations. Note that, by rotating the orientation of
the periodic nanostructure on the GaAs substrate by 90°,
we will obtain the negative value -π for the phase
retardation between the vertical and horizontal
polarizations. Therefore by controlling the orientation
of the periodic nanostructure we will be able to obtain
a total range of phase retardation between -π and π,
which will be sufficient for the design of a
binary-phase single-substrate BCGH. Furthermore,
this phase-retardation range will be useful for encoding
the phase difference of a multiple-phase-level BCGH,
whereas absolute relative phase will need to be
corrected by means of other methods.
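The sign reversal under a 90° rotation follows directly from the Jones matrix of the element. The sketch below, with function names of our choosing and the fitted parameters from the text used purely as sample values, rotates the diagonal Jones matrix and reads off the relative phase between the vertical and horizontal components.

```python
import numpy as np


def fbn_jones(a: float, b: float, phi_s: float) -> np.ndarray:
    """Jones matrix diag(a, b * exp(j * phi_s)) in the (horizontal, vertical) basis."""
    return np.array([[a, 0.0], [0.0, b * np.exp(1j * phi_s)]])


def rotate_element(jones: np.ndarray, angle: float) -> np.ndarray:
    """Jones matrix of the same element physically rotated by `angle`."""
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s], [s, c]])
    return rot @ jones @ rot.T


def relative_phase(jones: np.ndarray) -> float:
    """Phase of the vertical component relative to the horizontal one."""
    return float(np.angle(jones[1, 1] / jones[0, 0]))


PHI = np.deg2rad(162.5)  # sample value: the measured retardation
original = fbn_jones(0.67, 0.57, PHI)
rotated = rotate_element(original, np.pi / 2)
print(np.rad2deg(relative_phase(original)), np.rad2deg(relative_phase(rotated)))
```

Rotating the element by 90° swaps the two diagonal entries, so the relative retardation changes sign while its magnitude is preserved.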

In conclusion, we have fabricated a 490-nm-deep
form-birefringent nanostructure with a period of 200 nm in
a GaAs substrate. Form birefringence of the
nanostructure was studied numerically with RCWA and
compared with experimental measurements at a
wavelength of 920 nm. The theoretical modeling
used the grating profile measured from SEM
photographs of these nanostructures. The predicted
phase retardation of 163.3° is found to be in close
agreement with the experimentally measured result
of 162.5°. Controlling the orientation of the dielectric
nanostructure permits us to obtain a phase retardation
varying from -π to π for the two orthogonal linear
polarizations. The fabricated nanostructures show
extremely large artificial anisotropy compared with
that available in naturally birefringent materials
and are useful not only for single-substrate
form-birefringent computer-generated holograms but also
for numerous other polarization optics applications.

This study was funded by the National Science Foundation,
the U.S. Air Force Office of Scientific Research, the Optical
Technology Center of the Advanced Research Projects Agency,
and the Rome Laboratory.
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ABSTRACT 

We  introduce  a  novel  polarizing  beam  splitter  that  uses  the  anisotropic  spectral 
reflectivity  (ASR)  characteristics  of  a  high  spatial  frequency  multilayer  binary  grating.  By 
combining  the  form  birefringence  effect  of  a  high  spatial  frequency  grating  with  the  resonant 
reflectivity  of  a  periodic  multilayer  structure,  the  ASR  characteristics  for  the  two  orthogonal 
linear  polarizations  are  obtained.  Such  ASR  effects  allow  us  to  design  an  optical  element  that  is 
transparent  for  TM  polarization  but  reflective  for  TE  polarization.  The  properties  of  the 
polarizing  beam  splitter  are  investigated  using  rigorous  coupled-wave  analysis.  The  design 
results  show  that  an  ASR  polarization  beam  splitter  can  provide  a  high  polarization  extinction 
ratio for optical waves from a wide range of incident angles and a broad optical spectral
bandwidth.  Such  ASR  polarizing  beam  splitters  are  uniquely  suitable  for  image  processing  and 
optical  interconnection  applications. 


Keywords: polarizing beam splitter, high spatial frequency binary grating, multilayer structure,
form birefringence, diffractive optical element, optical components.

1.  INTRODUCTION 

Polarizing beam splitters (PBS) are essential components for numerous optical
information processing applications such as free-space optical switching networks¹, read-write
magneto-optic data storage systems², and polarization based imaging systems³. These
applications require that the PBS provide high extinction ratios, tolerate a wide angular
bandwidth and a broad wavelength range of the incident waves, and have a compact size for
efficient packaging. Conventional PBS employing either natural crystal birefringence (e.g.,
Wollaston prisms) or polarization selectivity of multilayer structures (e.g., PBS cubes) do not
meet these requirements. The Wollaston prism requires a large thickness to generate enough
walk-off distance between the two orthogonal polarizations due to the intrinsically small
birefringence of naturally anisotropic materials. An alternative design of Wollaston prisms⁴
reduces the thickness considerably by taking advantage of periodic multilayer slab structures
that possess form birefringence which is several times larger than that of natural birefringent
materials. However, the fabrication of such a multilayer slab structure is a tedious and long
process. PBS cubes are easier to fabricate, but they provide good extinction ratios only in a
narrow angular bandwidth for a limited wavelength range⁵. Other designs, which utilize form
birefringent high spatial frequency surface relief gratings⁶ and single-layer-coated dielectric
slabs⁷, have also been proposed to reduce the size of the components and to simplify the
fabrication process. However, they also suffer from low extinction ratio, small operating
angular bandwidth, and limited wavelength range.

In this manuscript, we introduce a new PBS device that uses the unique properties of
anisotropic spectral reflectivity (ASR) characteristics of a high spatial frequency multilayer
binary grating. The new ASR mechanism is based on combining the effect of form birefringence
of a high spatial frequency grating (i.e., the grating period is much less than the wavelength
of the incident field) with the resonant reflectivity of a multilayer structure. In the next
section we first describe intuitively the principle of the ASR behavior of the high spatial
frequency multilayer grating using Effective Medium Theory (EMT)⁸. Then we use Rigorous
Coupled-Wave Analysis (RCWA)⁹,¹⁰ tools for an optimum design of the PBS, where the EMT results
are used as an initial estimate. In sections 3 and 4, respectively, we use RCWA to design the
ASR polarizing beam splitters and characterize them in terms of polarization extinction ratios
for operation with waves of wide angular bandwidth and broad wavelength range. The results
demonstrate extremely high extinction ratios (e.g., 1,000,000:1) when the PBS is operated at a
specified wavelength and angle of incidence. Furthermore, good average extinction ratios (from
800:1 to 50:1) can be obtained when the PBS is operated for waves of 20° angular bandwidth with
wavelength ranging from 1300 nm to 1500 nm. The conclusions and future research directions are
discussed in section 5.

2.  ANISOTROPIC  SPECTRAL  REFLECTIVITY 

Consider a multilayer structure formed on a substrate by depositing alternating layers of
dielectric materials with high and low indices of refraction, n_h and n_l respectively. Such a
structure exhibits high reflectivity over a wide spectral bandwidth, particularly when the
thickness of each layer corresponds to a quarter-wave optical thickness at the center wavelength.
The reflectivity of the quarter-wave structure can be increased by increasing the ratio n_h/n_l
and the number of layers in the stack. Larger values of n_h/n_l also increase the spectral
bandwidth of high reflectance. For a multilayer structure made of isotropic dielectric materials,
the reflectivity spectra for the two orthogonal linear polarizations are identical and therefore
hardly separable. To separate the reflectivity spectra for the two orthogonal linear
polarizations, we need to substitute one or both materials (i.e., the high and low refractive
index materials) with birefringent materials. Such a multilayer structure of anisotropic
materials will possess reflectivity bands centered at different wavelengths for the two
orthogonal polarizations, thereby providing the desired separation of the reflectivity spectra.
However, since natural materials possess very small birefringence, the separation of the
reflection bands corresponding to the two orthogonal polarizations will be very limited. With our
approach, the separation of the reflection bands for the two orthogonal linear polarizations can
be considerably increased due to the high anisotropy that can be obtained with form birefringence.

Form  birefringence  effects'^  appear  in  high  spatial  frequency  gratings  formed  by  isotropic 
dielectric  materials.  Due  to  the  geometric  anisotropy  of  the  grating  structure,  the  two 
orthogonally  polarized  optical  fields,  one  parallel  to  the  grating  grooves  (designated  as  TE  field) 
and  the  other  perpendicular  to  the  grating  grooves  (designated  as  TM  field),  encounter  different 
effective  dielectric  constants  and  thus  acquire  a  phase  difference  between  them.  This  is  similar  to 
that  obtained  in  natural  anisotropic  materials.  The  magnitude  of  form  birefringence  depends  on 
the  geometric  composition  of  the  grating  structure  (including  the  dielectric  indices  and  the  shape 
of  the  grating)'®  as  well  as  the  angle  between  the  incident  field  and  the  grating  vector.  It  is 
important  to  note  that  the  value  of  form  birefringence  is  a  few  times  larger  than  the  birefringence 
obtained  with  naturally  birefringent  materials.  This  makes  the  high  spatial  frequency  grating  an 
excellent  candidate  for  separating  the  reflectivity  spectrums  for  the  two  orthogonal  polarizations. 
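Under normal incidence, the effective indices for the TE and TM polarizations of a surface-relief
high spatial frequency binary grating can be estimated from the second-order EMT^:

$$n_{TE}^{(2)} = \left[ \left(n_{TE}^{(0)}\right)^2 + \frac{1}{3}\left(\frac{\Lambda}{\lambda}\right)^2 \pi^2 F^2 (1-F)^2 \left(n_{III}^2 - n_{I}^2\right)^2 \right]^{1/2} \qquad (1)$$

$$n_{TM}^{(2)} = \left[ \left(n_{TM}^{(0)}\right)^2 + \frac{1}{3}\left(\frac{\Lambda}{\lambda}\right)^2 \pi^2 F^2 (1-F)^2 \left(\frac{1}{n_{III}^2} - \frac{1}{n_{I}^2}\right)^2 \left(n_{TM}^{(0)}\right)^6 \left(n_{TE}^{(0)}\right)^2 \right]^{1/2} \qquad (2)$$

where $F$ is the duty cycle of the grating, defined by $F = 1 - a/\Lambda$ with $a$ being the width of the air gap
in the grating (see Fig. 1), $\Lambda$ is the grating period, $\lambda$ is the wavelength of the incident wave,
$n_{I}$ and $n_{III}$ are the indices of air and the grating material, respectively, and
$n_{TE}^{(0)} = \left[F n_{III}^2 + (1-F) n_{I}^2\right]^{1/2}$ and $n_{TM}^{(0)} = \left[F/n_{III}^2 + (1-F)/n_{I}^2\right]^{-1/2}$
are the effective indices of refraction for TE and TM waves provided by the zero-order EMT.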

Figure 1a shows an example of a high spatial frequency multilayer binary grating. We use
SiO2 and Si, with refractive indices of 1.45^ and 3.51^ respectively (for a wavelength of 1.3 µm),
as the two materials for the multilayer structures because of their fabrication compatibility and
low absorption coefficients in the near infrared region (this results in a low insertion loss). For
operation of the form birefringent grating in the zero diffraction order, we set the grating period
equal to 0.5 µm and the duty cycle F = 0.5. Using second-order EMT (Eqs. 1 and 2) we obtain the
following effective refractive indices for the two materials: n_TE,Si ≈ 3.25, n_TE,SiO2 ≈ 1.26,
n_TM,Si ≈ 1.71, and n_TM,SiO2 ≈ 1.18. The effective indices of both materials are larger for TE
polarization than for TM polarization. This means that in the spectral domain, the reflection band
for TE polarization will be centered at a longer wavelength than that for TM polarization. This ASR
characteristic is the essential property needed to realize the PBS.
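
The effective-index estimate above can be checked numerically. The following is a minimal sketch, assuming the standard second-order (Rytov) EMT expressions for a lamellar grating at normal incidence; the function name and argument names are illustrative and do not come from the original text. With the stated parameters (period 0.5 µm, wavelength 1.3 µm, F = 0.5) it gives approximately 3.25 and 1.71 for Si, and effective index ratios close to the 2.58 (TE) and 1.45 (TM) quoted in the text.

```python
import math


def emt_second_order(n_air, n_mat, fill, period, wavelength):
    """Return (n_TE, n_TM) second-order EMT estimates at normal incidence.

    fill is the duty cycle F (fraction of the period occupied by material);
    period and wavelength must share the same length unit.
    """
    # Zero-order effective indices, squared.
    te0_sq = fill * n_mat**2 + (1.0 - fill) * n_air**2
    tm0_sq = 1.0 / (fill / n_mat**2 + (1.0 - fill) / n_air**2)
    # Common second-order prefactor: (1/3) (pi F (1-F) period/wavelength)^2.
    k = (math.pi * fill * (1.0 - fill) * period / wavelength) ** 2 / 3.0
    te2_sq = te0_sq + k * (n_mat**2 - n_air**2) ** 2
    tm2_sq = tm0_sq + k * (1.0 / n_mat**2 - 1.0 / n_air**2) ** 2 * tm0_sq**3 * te0_sq
    return math.sqrt(te2_sq), math.sqrt(tm2_sq)


# Parameters taken from the text: lengths in micrometers.
n_te_si, n_tm_si = emt_second_order(1.0, 3.51, 0.5, 0.5, 1.3)
n_te_sio2, n_tm_sio2 = emt_second_order(1.0, 1.45, 0.5, 0.5, 1.3)

print(f"Si:   n_TE = {n_te_si:.3f}, n_TM = {n_tm_si:.3f}")
print(f"SiO2: n_TE = {n_te_sio2:.3f}, n_TM = {n_tm_sio2:.3f}")
print(f"ratios: TE = {n_te_si / n_te_sio2:.2f}, TM = {n_tm_si / n_tm_sio2:.2f}")
```

These values are only starting estimates; as the text notes, the final design relies on RCWA.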


Fig. 1. (a) Schematic diagram of an ASR polarizing beam splitter operated with plane waves at
normal incidence. (b) Numeric results of the reflectivity for TE and TM polarized waves vs.
wavelength for a 7-layer PBS designed for normally incident waves.
Another characteristic is that the effective index ratio for TE polarized light
((n_h/n_l)_TE = 2.58) is larger than that for TM polarized light ((n_h/n_l)_TM = 1.45). This indicates that to
achieve  the  same  reflectance,  the  number  of  layers  required  by  TE  polarization  will  be  less  than 
that  required  by  TM  polarization.  To  minimize  the  number  of  layers  needed  to  achieve  a  desired 
performance,  we  choose  to  maximize  reflectivity  for  TE  polarized  light.  Therefore,  each  layer  has 
a  quarter-wave  optical  thickness  based  upon  the  TE  effective  index.  These  values,  estimated  by 
EMT,  are  used  as  the  basis  for  a  more  accurate  design  using  RCWA.  Optimization  is  done  by 
incrementally  varying  the  thickness  of  the  layers  to  obtain  the  highest  extinction  ratio  at  the 
operational wavelength of 1.3 µm. To achieve broad reflectance peaks in the spectrum, we use
high  refractive  index  materials  for  both  the  first  and  the  last  layers  in  the  structure.  Figure  lb 
shows  the  numeric  results  of  TE  and  TM  reflectances  as  a  function  of  the  wavelength  for  a  7-layer 
high  spatial  frequency  binary  grating  for  normally  incident  optical  fields.  As  expected,  the 
reflectance  peaks  for  TE  and  TM  polarized  light  are  separated  and  the  TE  polarization  has  a  higher 
reflectance  and  broader  bandwidth  at  longer  wavelengths  than  the  TM  polarization.  For  the  design 
wavelength of 1.3 µm, the TM polarized light will be transmitted while the TE polarized light will
experience  high  reflectivity  from  the  grating.  The  curves  also  show  that  the  polarization  extinction 
ratio  remains  high  over  a  wide  spectral  range  for  the  TM  polarization.  This  feature  allows  the 
element  to  function  as  a  low  insertion  loss  polarizer.  In  fact,  the  sidelobe  of  the  TM  reflection  is 
the only limit for the design of the polarization beam splitter. We anticipate that the amplitude of the
sidelobes  may  be  reduced  by  fine-tuning  other  design  parameters,  such  as  the  grating  duty  cycle. 
In  the  short  wavelength  region,  both  curves  become  irregular  due  to  the  coupling  between  the  zero 
and  the  higher  diffractive  orders. 
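
The quarter-wave starting design described above can be illustrated with a simple thin-film calculation. The sketch below treats each grating layer as a homogeneous film with the EMT effective indices quoted in the text (the SiO2 values follow from the quoted index ratios) and evaluates the reflectance at normal incidence with the characteristic-matrix method. It is not RCWA and not the authors' optimized design: the substrate index of 1.45, the neglect of dispersion, and all function and variable names are assumptions made for illustration.

```python
import math


def stack_reflectance(indices, thicknesses, wavelength, n_in=1.0, n_sub=1.45):
    """Normal-incidence reflectance of a lossless thin-film stack.

    indices and thicknesses are listed from the incident side to the substrate.
    n_sub = 1.45 is an assumed substrate index, not a value from the text.
    """
    m11, m12, m21, m22 = 1.0, 0.0, 0.0, 1.0
    for n, d in zip(indices, thicknesses):
        delta = 2.0 * math.pi * n * d / wavelength  # phase thickness of the layer
        cs, sn = math.cos(delta), math.sin(delta)
        a11, a12, a21, a22 = cs, 1j * sn / n, 1j * n * sn, cs
        m11, m12, m21, m22 = (
            m11 * a11 + m12 * a21,
            m11 * a12 + m12 * a22,
            m21 * a11 + m22 * a21,
            m21 * a12 + m22 * a22,
        )
    b = m11 + m12 * n_sub
    c = m21 + m22 * n_sub
    r = (n_in * b - c) / (n_in * b + c)
    return abs(r) ** 2


wavelength = 1.3  # micrometers
n_te = (3.25, 1.26)  # effective TE indices (Si, SiO2)
n_tm = (1.71, 1.18)  # effective TM indices (Si, SiO2)

# Seven layers with the high-index material first and last.
te_indices = [n_te[i % 2] for i in range(7)]
tm_indices = [n_tm[i % 2] for i in range(7)]

# Quarter-wave optical thickness based on the TE effective index.
thicknesses = [wavelength / (4.0 * n) for n in te_indices]

r_te = stack_reflectance(te_indices, thicknesses, wavelength)
r_tm = stack_reflectance(tm_indices, thicknesses, wavelength)
print(f"R_TE = {r_te:.4f}, R_TM = {r_tm:.4f}")
```

Because the layers are a quarter-wave thick only for the TE effective indices, the TE reflectance is close to unity while the TM wave sees a detuned, lower-contrast stack.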

3.  POLARIZING  BEAM  SPLITTER  DESIGN 


Fig. 2. (a) Schematic diagram of a 5-layer ASR polarizing beam splitter operated with incident
waves at an angle of 42°. (b) Numeric results for the reflectivity of TE and TM polarized waves vs.
wavelength for 42° incidence.


To realize a useful PBS that separates the path of the reflected wave (i.e., the TE polarized
wave in our design) from that of the incident wave, we investigate a PBS design that operates
with waves at large angles of incidence. Consider the geometry shown in Fig. 2a, where the input
wave vector is introduced at a 42° angle of incidence, lying in the plane perpendicular to the
grating grooves and parallel to the grating vector. This slanted incidence arrangement possesses
two additional advantages: 1) the reflectivity from each layer for TE polarization is increased,
so only five layers are needed to achieve the desired performance (normal incidence required
seven); 2) the sidelobe of the TM reflectivity is flattened, allowing operation of the beam
splitter over a wider spectral range. Here again we first used the EMT estimates as the starting
point for an accurate RCWA design. The thickness of each layer was first chosen to be a
quarter-wave for the TE polarized wave, and then fine-tuned to place the minimum of the TM
reflectivity at the desired operating wavelength within the maximum band of the TE reflectivity.
Since the reflection band for TE polarization is very wide, changing the thickness mostly affects
the reflection band for TM polarization, and thus fine-tuning to achieve a desired performance is
possible. Fig. 2b shows the numeric results of the reflectance vs. wavelength for an optical wave
at slanted incidence on the 5-layer grating. For an incident wavelength of 1.3 µm, the TE and TM
reflectances are 0.9971 and 0.0009128, and the polarization extinction ratio on the reflection
side of the beam splitter is approximately 1100:1. The highest extinction ratio of 1,000,000:1 is
obtained at the wavelength λ = 1.265 µm. A relatively flat minimum zone of the TM reflectivity
under the broad TE reflectivity peak indicates that broadband operation is possible.
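
As a quick arithmetic check of the reflection-side extinction ratio, the two reflectances quoted above can be divided directly. This is only a sketch: the decibel conversion is added for convenience and does not appear in the original text.

```python
import math

r_te = 0.9971      # TE reflectance quoted in the text
r_tm = 0.0009128   # TM reflectance quoted in the text

ratio = r_te / r_tm                  # extinction ratio on the reflection side
ratio_db = 10.0 * math.log10(ratio)  # same ratio expressed in decibels

print(f"extinction ratio = {ratio:.0f}:1 ({ratio_db:.1f} dB)")
```

The result, about 1.09×10³:1, corresponds to the figure of roughly 1100:1 given in the text.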

4.  CHARACTERIZATION  OF  THE  PBS  PERFORMANCE 


Fig. 3. Numeric results for the reflectivity of TE and TM polarized waves vs. duty cycle of the high
spatial frequency binary grating. The design parameters are the same as those used in Fig. 2a, except that the
incident wavelength λ is now fixed at 1.3 µm and the duty cycle F is varied from 0 to 1.

In  order  to  find  the  tolerance  of  the  ASR  polarizing  beam  splitter  to  possible  fabrication 
errors,  we  numerically  characterize  the  performance  of  the  PBS  by  changing  the  duty  cycle  of  the 
multilayer  gratings.  We  use  the  same  design  parameters  of  the  PBS  shown  in  Fig.  2a,  except  that 
the incident wavelength λ is fixed at 1.3 µm and the duty cycle F is varied in the range 0 to 1.




The numeric results indicate that the reflectances of both polarizations stay approximately
the same for duty cycles ranging from 0.3 to 0.55 (see Fig. 3). This shows the high tolerance of
the ASR polarizing beam splitter to fabrication errors. The reflectances of the two polarizations
approach each other as the duty cycle approaches 0 or 1, and they become identical
when the waves propagate at normal incidence.

Fig. 4. Contour plots of TE reflectance and TM transmittance vs. incident angles (φ, θ) as defined in Fig. 2. (a)
TE reflectance at wavelength λ = 1.3 µm, (b) TM transmittance at wavelength λ = 1.3 µm, (c) TE reflectance at
wavelength λ = 1.5 µm, and (d) TM transmittance at wavelength λ = 1.5 µm.


We also investigate the angular dependence of the ASR polarizing beam splitter. As
shown in Fig. 2a, the angles of incidence are varied to span an angular bandwidth of ±10° in both
the θ and φ directions, defined around the initial 42° bias angle. The results shown in Fig. 4
indicate that, at a wavelength of 1.3 µm, the reflectance for TE polarized light and the
transmittance for TM polarized light are both better than 99% inside the 5° angular bandwidth
cone, and better than 97% inside the 10° angular bandwidth cone. Around 1.5 µm, the results show
that the TE reflectance and TM transmittance of this PBS are still better than 96% inside the 5°
angular bandwidth cone and better than 94% inside the 10° angular bandwidth cone. These results
indicate that operation over a wide angular bandwidth, as well as a broad spectral range, is
possible using this design. The performance of the PBS can be further improved by adding more
layers to increase the range of high TE reflectivity as well as to bring the efficiency closer to 100%.

Fabrication of such a PBS in the visible spectral range is a challenging task due to the need
to fabricate a grating with a sub-wavelength period. However, for the near infrared range of
operation, fabrication of the structure is practical. For example, in our design the total grating
depth of the 5 layers is only 0.78 µm with a grating period of 0.5 µm, resulting in a grating
aspect ratio of about 3:1, which is well within the capabilities of silicon microfabrication
technology. Current thin film coating technology allows us to control the layer thickness to
within a few nanometers. Therefore, fabrication of the designed PBS (shown in Fig. 2a) can be
done by first fabricating the multilayer structure, followed by direct e-beam
lithography and ion beam etching of a binary grating profile.

5.  CONCLUSION 

In  conclusion,  we  have  introduced  a  novel  PBS  device  that  is  based  on  the  ASR 
characteristics  of  a  high  spatial  frequency  multilayer  binary  grating.  This  PBS  combines  the 
form birefringence effect of a high spatial frequency grating with the high reflectance of
multilayer structures. We use EMT for initial design and RCWA for optimization of the ASR
polarizing  beam  splitter.  We  numerically  characterize  the  ASR  polarizing  beam  splitter  in  terms 
of  polarization  extinction  ratio  for  operation  with  waves  of  wide  angular  bandwidth  and  broad 
wavelength  range.  The  results  show  that  the  ASR  polarizing  beam  splitter  not  only  provides  a 
very  high  extinction  ratio  for  the  two  orthogonal  polarizations,  but  also  can  be  operated  with 
optical  signals  of  wide  angular  bandwidth  and  broad  spectral  range.  From  the  numeric  results, 
the fabrication error tolerance of the PBS has been shown to be very high. Another important
advantage of the ASR polarizing beam splitters is their negligible insertion loss, achieved by
using  non-absorbing  dielectric  materials.  The  ASR  polarizing  beam  splitters  combine  such 
unique  features  as  small  size,  negligible  insertion  losses,  high  polarization  extinction  ratios,  and 
operation  with  waves  of  large  angular  bandwidth  and  broad  spectral  range.  These  features  make 
these  devices  desirable  for  use  in  optical  image  processing,  optical  interconnections  as  well  as 
other  polarization  optics  applications. 
ABSTRACT

We  introduce  a  novel  method  of  modeling  PLZT  phase  modulators.  Traditionally,  modeling 
has  been  based  upon  fitting  the  constant  quadratic  electro-optic  coefficient  to  empirical  data.  Our 
characterization has shown that the electro-optic coefficient is not a constant and that the
electro-optic effect saturates at electric field strengths that exist in standard surface electrode device
configurations.  We  have  also  found  that  the  additional  effects  of  light  scattering  and 
depolarization,  which  depend  on  the  strength  of  the  applied  electric  field,  are  significant  factors 
for  modeling  device  design  and  optimization. 

Keywords:  PLZT,  electro-optic  effect,  phase  modulators,  finite  element  analysis,  depolarization, 
scattering. 


1.  INTRODUCTION 

PLZT is an excellent material choice for use in spatial light modulators (SLM) due to its
large electro-optic effect and low absorption for thin wafers*. PLZT ceramics are used as
transverse electro-optic modulators where the electric field is applied using interdigital surface
electrodes (ISE). Such electro-optic devices are modeled based upon the quadratic electro-optic
effect^'^ as well as its combination with the linear electro-optic effect. However, these models
do not accurately predict the performance of an ISE device fabricated at UCSD. In order to more
accurately model such devices, we experimentally characterized PLZT’s electro-optic properties.
Then  we  used  finite  element  analysis  (FEA)  to  characterize  the  field  distributions  for  ISE 
devices.  Finally,  by  combining  the  electric  field  values  provided  by  FEA  with  the  experimental 
electro-optic  data,  we  were  able  to  predict  the  performance  of  our  ISE  device.  Although  this 
methodology  was  carried  out  with  PLZT  9.0/65/35  material,  it  can  be  applied  to  modeling 
electro-optic  devices  with  arbitrary  choice  of  material,  electrode  structure  and  geometry. 

In  the  following  section  we  will  review  the  basic  theory  of  quadratic  electro-optic  materials, 
as  well  as  formulate  the  methodology  for  characterization  of  PLZT  electro-optic  material.  In 
section  3,  we  will  apply  FEA  to  model  the  electric  field  induced  by  an  electric  potential 
difference  between  electrodes  of  ISE  devices.  The  results  of  section  2  are  integrated  into  the  FEA 
model  to  determine  the  relationship  between  the  change  in  relative  phase  of  two  orthogonally 
polarized  components  of  an  incident  beam  and  the  externally  applied  electric  field.  In  section  4 
we  compare  this  prediction  with  the  actual  values  taken  from  the  fabricated  ISE  device. 
Conclusions  and  future  directions  are  discussed  in  section  5. 
2. CHARACTERIZATION OF ELECTRO-OPTIC MATERIAL


G. Haertling and C. Land initiated extensive studies characterizing PLZT ceramics^. They
found that thin wafers with compositions containing greater than 8 at.% La had strong
electro-optic properties with transmission values of close to 100%. At room temperature, PLZT is
isotropic due to its cubic crystallographic structure. When an external electric field is applied,
the PLZT material becomes polarized, demonstrating anisotropic optical characteristics. This same
behavior is seen in crystals such as BaTiO3 (in its cubic form) that exhibit primarily third-order
non-linear optical properties, which in turn lead to quadratic electro-optic effects. Assuming
PLZT follows this uniaxial crystal model, the optic axis will be determined by the direction of an
externally applied electric field. The induced ordinary and extraordinary indices of refraction are
determined by


$$n_o = n - \tfrac{1}{2} n^3 R_{12} E^2 \quad \text{and} \quad n_e = n - \tfrac{1}{2} n^3 R_{11} E^2 \qquad (1)$$

$$\Delta n(E) = n_e - n_o = -\tfrac{1}{2} n^3 R E^2 \qquad (2)$$

where $R = R_{11} - R_{12}$ and $R_{ij}$ are the quadratic electro-optic coefficients, $n$ is the refractive
index of PLZT, and $\Delta n(E)$ is the induced optical birefringence. For PLZT with 9.0 at.% La the
accepted value^ for R is approximately 3.8×10^-16 m²/V². Haertling and Land^ noticed that the

induced  optical  birefringence  saturates  with  an  externally  applied  electric  field  reaching  a 
maximum’  value  of  1.1x10'^  .  More  recently,  M.  Title^  mentioned  the  saturation  effect  in 
modeling  embedded  electrode  PLZT  modulators. 
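
To give a feel for the magnitudes involved, the constant-coefficient quadratic model can be evaluated directly. This is a minimal sketch, assuming the usual Kerr form Δn = −½ n³ R E², a zero-field index n = 2.5 (an assumed value, not given in the text), and the commonly quoted R ≈ 3.8×10⁻¹⁶ m²/V² for PLZT 9/65/35. It deliberately ignores the saturation discussed above, which is the reason the rest of the paper replaces this model with measured data.

```python
def induced_birefringence(e_field, n=2.5, r_coeff=3.8e-16):
    """Quadratic (Kerr) birefringence for a field e_field in V/m.

    n = 2.5 is an assumed zero-field index; r_coeff is in m^2/V^2.
    Saturation of the electro-optic effect is NOT modeled here.
    """
    return -0.5 * n**3 * r_coeff * e_field**2


for e in (2.0e5, 1.0e6, 2.0e6):
    print(f"E = {e:.1e} V/m -> delta_n = {induced_birefringence(e):.3e}")
```

In this unsaturated model the birefringence grows fourfold each time the field doubles, which is the behavior the measurements reported below contradict at high fields.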
Figure 1. (a) For ISE on PLZT the index ellipsoid follows the tangent of the curved E-field lines. (b)
PLZT wafer, 300 µm thick and 2.01 mm wide, placed between gold-coated copper plates. The plates
were inserted into a Teflon base to ensure good electrical isolation, and the cuvette was filled with
mineral oil to prevent arcing due to exceeding the breakdown voltage of air.


A typical ISE device is constructed of stripe-shaped metal electrodes of width d and length L.
A voltage applied across such electrodes creates curved electric field lines within the PLZT
(see Figure 1a). Since the optic axis follows the direction of the applied electric field, the axis
orientation will vary as a function of position within the PLZT. Assuming the electrode length
L ≫ d, the indices of refraction parallel and perpendicular to the surface electrodes can be
approximated^ by

$$n_{\parallel} = n_o \quad \text{and} \quad n_{\perp}(x,y) = \left[\frac{\cos^2(\theta(x,y))}{n_e^2} + \frac{\sin^2(\theta(x,y))}{n_o^2}\right]^{-1/2} \qquad (3)$$

where

$$\theta(x,y) = \tan^{-1}\left(\frac{E_y(x,y)}{E_x(x,y)}\right). \qquad (4)$$

Defining the relative change in index $\Delta n(\theta(x,y)) = n_{\perp} - n_{\parallel}$, we determine the relative phase
between the parallel and perpendicular polarization components of an optical beam passing
through the device by

$$\Phi(x) = \frac{2\pi}{\lambda}\int_0^{l} \Delta n(\theta(x,y))\,dy \qquad (5)$$

where $\lambda$ is the wavelength in vacuum and $l$ is the thickness of the electro-optic material.

In the following we experimentally determine the phase relationship provided by
Equation 5. By placing the PLZT wafer between large parallel metal plates (see Figure 1b), a
homogeneous electric field perpendicular to the plates (i.e., θ = 0) will exist within the
dielectric. The field strength is a function of the potential difference between the plates (i.e.,
E = V/d, where d is the distance between the plates). To measure the induced phase retardation
we illuminate the PLZT sample using a normally incident HeNe laser beam. The incident beam is
linearly polarized at 45° with respect to the direction of the applied electric field, providing two
equal components parallel and perpendicular with respect to the electrode structure. As the
voltage, and thus the electric field strength, is varied, there will be a change in the relative phase
between these two components.

By placing a crossed polarizer at the output of an ideal phase modulating device, the light
intensity will vary as a function of the relative phase according^ to the relation

$$T = \tfrac{1}{2}a^2 + \tfrac{1}{2}b^2 - ab\sin(\Phi) \qquad (6)$$

where $\Phi$ is the relative phase, and $a^2$ and $b^2$ are the transmittances for the two orthogonal
components of the light. We measure the transmittance, T, through a crossed polarizer, as well as
the transmittances through vertically and horizontally oriented polarizers, $a^2$ and $b^2$ respectively,
as functions of the electric field. Then, by solving Equation 6 for $\Phi$, we expect to find the relative
phase as a function of electric field. However, using this method, the resultant phase is not
continuous when a PLZT based device is being investigated. This is due to depolarization effects
observed in PLZT phase modulators.

By introducing a depolarizing term into the transmittance of the orthogonal components in
Equation 6, we obtain

$$T = \tfrac{1}{2}\left(a^2 + c_a^2\right) + \tfrac{1}{2}\left(b^2 + c_b^2\right) - ab\sin(\Phi) \qquad (7)$$

where $c_a^2$ and $c_b^2$ are the fractions of depolarized light corresponding to the incident vertically and
horizontally polarized components. Defining $A^2 = a^2 + c_a^2$ and $B^2 = b^2 + c_b^2$, we get the
relationship

$$T = \tfrac{1}{2}A^2 + \tfrac{1}{2}B^2 - \sqrt{\left(A^2 - c^2\right)\left(B^2 - c^2\right)}\,\sin(\Phi) \qquad (8)$$

where we also assume that $c_a^2 \approx c_b^2 = c^2$. Using the measured values for $T$, $A^2$, and $B^2$, and a curve fit for
$c^2$, we solve Equation 8, which provides a continuous relative phase (see Figure 2b).
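
The inversion step described above can be sketched in a few lines. The example assumes the crossed-polarizer relation takes the form T = ½(A² + B²) − √((A² − c²)(B² − c²))·sin(Φ), which is an assumed reading of the relation in the text; the function name, argument names, and sample numbers are hypothetical. Note that the arcsine returns only the principal value, so a continuous phase curve additionally requires tracking the branch from one field value to the next.

```python
import math


def relative_phase(t, a_sq, b_sq, c_sq):
    """Principal-value relative phase from measured transmittances.

    t     : transmittance through the crossed polarizer
    a_sq  : transmittance through the vertical polarizer (A^2)
    b_sq  : transmittance through the horizontal polarizer (B^2)
    c_sq  : fitted depolarized fraction (c^2), assumed equal for both components
    """
    amplitude = math.sqrt((a_sq - c_sq) * (b_sq - c_sq))
    s = (0.5 * (a_sq + b_sq) - t) / amplitude
    s = max(-1.0, min(1.0, s))  # guard against noise pushing |sin| above 1
    return math.asin(s)


# Round-trip check with made-up numbers: build T from a known phase, then invert.
phi_true = 0.4
a_sq, b_sq, c_sq = 0.45, 0.40, 0.05
t_sim = 0.5 * (a_sq + b_sq) - math.sqrt((a_sq - c_sq) * (b_sq - c_sq)) * math.sin(phi_true)
phi_rec = relative_phase(t_sim, a_sq, b_sq, c_sq)
print(f"recovered phase = {phi_rec:.3f} rad")
```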


Figure 2. (a) Transmittance T through crossed polarizers set at 45° and through parallel polarizers set
vertically and horizontally, A² and B², as functions of a horizontally applied electric field. (b) Relative phase
as a function of applied electric field. We also show the best fit for a quadratic electro-optic coefficient as


^  =  6x10''*  (^)^ 


Observing the two curves A² and B² of Figure 2a, we notice a dramatic intensity drop after the
electric field reaches approximately 7x10^ V/m. This attenuation is mainly due to
scattering effects. It also causes a corresponding drop in the transmittance, T, through the
crossed  polarizers.  Furthermore,  T  varies  sinusoidally  with  the  period  varying  as  a  function  of 
electric  field.  The  frequency  of  the  sine  initially  increases,  whereas  at  fields  above  1x10^  V/m  the 
frequency  decreases.  Above  the  value  of  2x10^  V/m  the  contrast  ratio  approaches  1:1,  which  is 
primarily  due  to  the  depolarization  effects.  Due  to  the  need  for  high  contrast  ratios  in  phase 
modulators,  scattering  and  depolarization  effects  must  be  factored  into  the  design  of  such 
devices. 
Figure 2b shows the relative phase vs. electric field, where the experimental data is
determined  by  solving  Equation  8.  Below  electric  fields  of  2x10  V/m  there  is  practically  no 
phase  change  in  the  PLZT.  Above  values  of  1x10^  V/m  we  observe  that  the  phase  change  begins 
to  saturate.  Our  calculations  show  that  for  PLZT  surface  electrode  devices,  there  are  regions 
where  the  field  strengths  are  on  the  order  of  2x10^  V/m,  and  therefore,  the  saturation  effect  must 
also  be  taken  into  consideration  in  these  devices. 


3.  FINITE  ELEMENT  ANALYSIS  OF  ELECTRO-OPTIC  MODULATORS 


Figure 3. Mentat’s output provides added insight into the effects of the electric field
distribution. Shown is a portion of the contour band plot of the magnitude of the electric
field between surface electrodes (outlined) 250 µm wide with a gap of 50 µm. For field
strengths below 2.25x10^ V/m there is no phase modulation, and above 2.0x10^ V/m the
modulation has reached a maximum (see Figure 2b). Therefore, modulation only occurs
within 100 µm of the surface, and near the edges of the electrodes we observe electric field
strengths beyond the ‘saturation’ level.

Finite  Element  Analysis  (FEA)  is  one  of  several  methods^'^’"’'^  available  for  calculating  the 
electric  field  induced  by  metal  electrodes.  We  use  Mentat,  a  commercial  FEA  program  from 
Marc  Analysis,  which  provides  an  excellent  tool  for  mesh  generation,  field  calculations  and 
visualization  (see  Figure  3).  We  used  FEA  to  determine  the  electric  field  distribution  in  PLZT 
devices with surface electrodes. A typical example of Mentat’s output is shown in Figure 3. For
this particular ISE configuration, we observe that 100 µm from the surface the electric field
strength drops below the minimum required for phase modulation. We also see that near the edges
of the electrodes the field goes beyond the maximum, i.e., the phase modulation has become
constant. Using FEA we are able to calculate the electric field strength and direction, at any
point,  for  any  configuration  of  electrodes. 
To find an explicit relationship between the magnitude and the direction of the electric field
and the relative phase, we substitute Eqs. 1, 2 and 3 into Equation 5 and obtain

$$\Phi(x) = \frac{2\pi}{\lambda}\int_0^{l}\left\{\left[\frac{\cos^2(\theta(x,y))}{n_e^2} + \frac{\sin^2(\theta(x,y))}{n_o^2}\right]^{-1/2} - n_o\right\}dy \qquad (9)$$

Using a Taylor series expansion this can be approximated by

$$\Phi(x) \approx \frac{2\pi}{\lambda}\int_0^{l}\left\{n_o\left[1 - \frac{2\,\Delta n(E)}{n_o}\cos^2(\theta(x,y))\right]^{-1/2} - n_o\right\}dy$$

$$\approx \frac{2\pi}{\lambda}\int_0^{l}\Delta n(E)\cos^2(\theta(x,y))\,dy = -\frac{\pi n^3}{\lambda}\int_0^{l} R(E(x,y))\,E^2(x,y)\cos^2(\theta(x,y))\,dy \qquad (10)$$

where we define the birefringence as quadratic, but with the electro-optic coefficient also being a
function of the electric field. This can be more simply stated as

$$\Phi(x) = \int_0^{l}\phi(E(x,y))\cos^2(\theta(x,y))\,dy \qquad (11)$$

where $\phi(E(x,y))$ is the phase function from the experimental curve of Figure 2b. Applying the
electric field results from FEA we find

$$\Phi(x) = \sum_{i=1}^{N}\phi(E(x,y_i))\cos^2(\theta(x,y_i))\,l_i \qquad (12)$$

where $N$ is the number of finite elements that the light ray passes through and $l_i$ is the height of
each element.

Mentat FEA software calculates the x and y components of the electric field (in this case we
are using a two-dimensional model) at four integration points for each element. Taking the
average over each element, we use the magnitude of the field (i.e., $E = \sqrt{E_x^2 + E_y^2}$), the phase
function, and the orientation of the index ellipsoid from Equation 4 to get the relative phase
change for each element. Summing the change in phase through a column of elements, as
expressed by Equation 12, we find the change in phase for a light ray passing through a line of
elements (see the dotted line in Figure 3). Looking at the series of columns across the electrode
gap gives us a phase profile for a plane wave passing through the device.
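
The column summation described above can be sketched as follows. This is an illustration only: the field arrays stand in for element-averaged FEA output, the phase function is taken to be a phase per unit path length, and `toy_phase_per_length` with its threshold, saturation, and maximum values is a hypothetical placeholder for the measured curve, not data from the text.

```python
import math


def column_phase(ex, ey, heights, phase_per_length):
    """Sum phi(|E_i|) * cos^2(theta_i) * l_i over one column of elements.

    ex, ey  : element-averaged field components (V/m) along the ray
    heights : element heights l_i (m)
    phase_per_length : callable giving phase per unit length (rad/m) vs. |E|
    """
    total = 0.0
    for e_x, e_y, h in zip(ex, ey, heights):
        magnitude = math.hypot(e_x, e_y)
        theta = math.atan2(e_y, e_x)  # orientation of the local optic axis
        total += phase_per_length(magnitude) * math.cos(theta) ** 2 * h
    return total


def toy_phase_per_length(e_mag, e_min=2.0e5, e_sat=2.0e6, phi_max=1.0e4):
    """Hypothetical stand-in for the measured curve: zero below e_min,
    linear in between, constant (saturated) above e_sat. All numbers are made up."""
    if e_mag <= e_min:
        return 0.0
    if e_mag >= e_sat:
        return phi_max
    return phi_max * (e_mag - e_min) / (e_sat - e_min)


# Three 10-micrometer elements with a saturated field parallel to the surface.
phase_parallel = column_phase([2.0e6] * 3, [0.0] * 3, [1.0e-5] * 3, toy_phase_per_length)
# Same field magnitude pointing into the wafer contributes essentially nothing.
phase_normal = column_phase([0.0] * 3, [2.0e6] * 3, [1.0e-5] * 3, toy_phase_per_length)
print(phase_parallel, phase_normal)
```

Repeating the call for each column of elements across the electrode gap yields the phase profile seen by a plane wave.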

4.  MODELING  VS.  EXPERIMENTAL  PERFORMANCES 

Using  the  modeling  procedures  discussed  in  section  3  we  determine  the  calculated  phase 
distribution  for  a  simulated  ISE  device  and  compare  it  to  that  found  experimentally  using  a 
fabricated  device.  The  FEA  modeling  and  the  experimentally  measured  results  are  found  to  be  in 
good  agreement  for  voltages  of  less  than  and  between  electrode  gaps  of  about  500  |im  (see 
Figure  4a).  For  narrow  gaps  and  higher  voltages  (see  Figure  4b)  there  is  a  difference  between  the 
modeling  and  experimental  results.  One  possible  explanation  would  be  that  the  electric  field 
strength is weaker than the value calculated by the FEA model due to screening effects. These
screening  effects  occur  when  free  carriers  (photo-induced  or  due  to  surface  states)  create  a  space- 
charge  distribution  near  the  electrodes.  Our  future  work  will  entail  investigating  these 
phenomena. 


[Figure 4 panels. Title: Normalized Transmittance vs. Applied Voltage. (a) 500 µm gap and 2 mm electrodes; (b) 50 µm gap and 250 µm electrodes. Horizontal axis: Voltage (V).]

Figure 4. Comparison of FEA model to experimental data for transmission through crossed
polarizers. (a) shows a relatively good fit for V1 and V2 (i.e., the first maximum and minimum) for an
electrode gap of 500 µm, whereas (b) shows a poor fit with a narrow electrode gap of 50 µm.


Currently,  in  our  model,  we  are  compensating  for  this  ‘weakening’  effect  by  introducing  a 
constant  factor.  Consequently,  we  are  able  to  accurately  model  the  behavior  of  fabricated  devices 
operated  at  voltages  less  than  (see  Figure  5a).  Notice  that  the  ‘integrated  phase’  depends 
linearly  on  the  applied  voltage  in  contrast  to  the  quadratic  behavior  predicted  by  previous 
models. 

For surface electrode based devices, the two most important parameters are the electrode
width and the size of the gap between electrodes. By varying the gap size in our FEA model, while
holding the width (160 µm) and voltage (150 V) constant, we have determined that as the gap
decreases the phase shift increases proportionately. This is expected from the linear
relationship of phase to electric field, since the electric field strength scales inversely with
gap size. However, when the gap size becomes smaller than 40 µm there is very little increase in
phase modulation, because the electric field values are above the saturation level. We also
find that as the electrodes increase in size there is a corresponding increase in phase shift, but
electrodes wider than 160 µm were of little benefit (see Figure 5b). The conclusion is that
the optimal configuration for this material is a 40 µm gap between 160 µm wide electrodes.
It was experimentally determined® that for interdigital surface electrodes on PLZT 8.8/65/35
the optimum configuration is 160 µm electrode widths with 40 µm gaps between electrodes.
From this excellent agreement we conclude that optimization of design
parameters can be done successfully using our FEA model.


[Figure 5 plots: (a) Phase vs. Voltage (V); (b) Phase vs. Electrode Width (µm), 40 µm gap; (c) Phase Profile vs. Voltage.]


Figure 5. (a) Comparison of the FEA model with experimental data for phase as a function of applied
voltage for surface electrodes 160 µm wide with a 40 µm gap (after including a compensating constant
factor). (b) Holding the gap width (40 µm) and voltage (150 V) constant, the results of the FEA simulation
show that as electrode width approaches 160 µm the increase in phase modulation slows down. (c) The
gradient of the phase front for various applied voltages for etched electrodes 40 µm deep. A relatively flat
phase profile across the entire gap can be realized around 50 Volts.


Gap Width (µm)    Electrode Width (µm)    Electrode Depth (µm)    Vπ (V)
80                20                      100                     56.5
60                15                      75                      46.0
40                10                      50                      36.0
20                5                       25                      30.0

Table 1. The reduction in the necessary applied voltage to achieve π phase modulation highlights the
advantages of ‘scaling’ down the electrode size and spacing.
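
The geometry behind Table 1 can be checked with a few lines of arithmetic. The sketch below assumes that fill factor means gap / (gap + electrode width) and that aspect ratio means etch depth / electrode width; neither definition is spelled out in the text, so both are labeled assumptions. It also computes the change in Vπ per micrometer of gap between adjacent rows.

```python
# Rows of Table 1: gap, electrode width, electrode depth (um), V_pi (V).
TABLE_1 = [
    (80, 20, 100, 56.5),
    (60, 15, 75, 46.0),
    (40, 10, 50, 36.0),
    (20, 5, 25, 30.0),
]

def fill_factor(gap, width):
    # Assumed definition: open aperture divided by one electrode period.
    return gap / (gap + width)

def aspect_ratio(depth, width):
    # Assumed definition: etch depth divided by electrode width.
    return depth / width

def v_pi_slopes(rows):
    # Change in V_pi per micrometer of gap between adjacent table rows.
    return [(rows[i][3] - rows[i + 1][3]) / (rows[i][0] - rows[i + 1][0])
            for i in range(len(rows) - 1)]

for gap, width, depth, v_pi in TABLE_1:
    print(gap, fill_factor(gap, width), aspect_ratio(depth, width), v_pi)
print(v_pi_slopes(TABLE_1))
```

Under these definitions every row has an 80% fill factor and a 5:1 aspect ratio, and the slope falls from roughly 0.5 V/µm for the larger gaps to 0.3 V/µm for the step from 40 µm to 20 µm.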


With  our  general  modeling  approach,  by  varying  other  parameters  within  our  FEA  model  we  can 
design  and  evaluate  many  device  configurations  without  having  to  fabricate  many  devices.  In  the 
following  we  will  briefly  discuss  the  optimization  of  two  independent  design  criteria,  embedded 
electrode  geometry  and  phase  uniformity.  One  of  the  limitations  in  using  ISE  devices  is  their  low 
transmittance  due  to  a  small  gap  to  electrode  ratio,  i.e.  small  fill  factor.  In  contrast  to  surface 
electrodes,  much  larger  fill  factors  can  be  realized  by  using  electrodes  that  are  embedded  into  the 
surface.  To  find  the  effect  of  using  different  etch  depths  we  hold  other  parameters  constant  and 
change  the  scale  of  the  electrode  structure.  Using  a  fill  factor  of  80%  and  assuming  that  a  5:1 
aspect ratio is possible in etching PLZT substrate, we observe that Vπ decreases steadily as the
gap gets smaller (see Table 1), indicating that the decrease in switching energy is proportional to
the  decrease  in  gap  size.  Under  the  constraint  of  an  80%  fill  factor  we  observe  that  the  linear 
relationship no longer holds for a gap size below 40 µm. For small geometries the electric fields
need  to  be  high  due  to  the  short  active  modulation  path  length.  Consequently,  we  once  again 
observe  the  effects  of  ‘saturation’  of  the  phase  modulation. 

Another  important  characteristic  of  an  electro-optic  modulator  is  the  homogeneity  of  the 
modulated  phase  front.  Our  FEA  modeling  allows  us  to  calculate  the  phase  distribution  across 
the  aperture  between  the  electrodes.  A  plane  wave  passing  through  the  gap  between  electrodes 
will  attain  a  phase  curvature  across  the  aperture  depending  on  the  field  distribution.  According  to 
our model, for a 50 µm wide gap at 150 Volts, there is a phase difference of approximately 0.7
radians from center to edge. By simulating various electrode geometries and applied voltages we
find that by using electrodes 60 µm wide, spaced 40 µm apart and etched 40 µm deep the wave
front has an almost perfectly flat π phase profile at 53.6 Volts (see Figure 5c). It is important to
note  here  that  we  have  been  analyzing  one  characteristic  at  a  time.  In  order  to  find  an  optimum 
device  configuration,  many  coupled  characteristics  must  be  taken  into  account  and  weighted 
according  to  specific  device  requirements. 

5.  CONCLUSION 

We  have  used  a  uniform  applied  electric  field  within  PLZT  in  order  to  experimentally 
characterize  the  electro-optic  response  of  the  material.  This  characterization  has  highlighted  the 
fact  that  scattering  and  depolarization  effects  need  to  be  considered  in  determining  the  phase 
function.  Furthermore,  electric  field  distributions  obtained  using  various  electrode  configurations 
have  been  calculated  using  FEA.  These  resultant  electric  fields  were  integrated  with  the  phase 
function  of  the  material  to  determine  the  characteristic  phase  modulation  of  an  ISE  electro-optic 
device.  The  calculated  strength  of  the  electric  fields  has  shown  us  that  ‘saturation’  of  phase 
modulation  needs  to  be  considered  in  device  design.  We  have  also  found  that  an  electric  field 
‘weakening’  effect  needs  to  be  factored  into  the  model  and  in  the  future  we  will  investigate  the 
cause of this phenomenon. After compensating for these various effects we are able to model the
behavior  of  a  device  as  a  function  of  a  variety  of  parameters.  We  have  shown  that  this  model  is 
useful  in  optimizing  individually  such  device  characteristics  as  increased  transmittance  and 
homogeneity  of  the  phase  front.  Multiple  characteristics  of  such  devices  that  are  mutually 
coupled can also be optimized.

Abstract

PLZT  8.8-9.5/65/35  electrooptic  (EO)  devices 
subject  to  electric  fields  on  the  order  of  1  can 

become highly scattering and depolarizing as well as
exhibit high-order EO effects. We describe a method
of modeling EO modulators that accounts for these
effects. Utilizing these characteristics in simulating
surface  electrode  devices  we  compare  our  model  to 
measurements  of  a  fabricated  device  and  find 
excellent  correlation. 

Key  Words 

PLZT  modeling,  Scattering  and  Depolarization, 
Optical  Characterization,  Mueller  Matrices. 

Introduction 

Applying the advances in electronic and optical
computer aided design (CAD) to optoelectronic
systems requires precision simulation of electrooptic
(EO) devices. In this paper we present a simple but
accurate method of modeling EO devices using, as an
example, lanthanum-modified lead titanate zirconate
(PLZT) with compositions 9.X/65/35. These
ferroelectric ceramics have strong EO effects and are
cost effective, and are therefore excellent candidates


for  use  in  optoelectronic  systems.  In  the  absence  of 
field  the  ceramic  is  optically  isotropic  whereas  an 
applied  field  induces  anisotropy  and  optical 
birefringence.  This  electrooptic  response  has  been 
modeled  as  a  classic  Kerr  quadratic  effect  [1]  as  well 
as  a  combination  of  linear  and  quadratic  effects  [2]. 
However,  these  techniques  fall  short  of  accurately 
modeling  fabricated  devices,  which  will  be  required 
for  optoelectronic  CAD  tools. 

We  have  developed  a  simple  method  of 
characterizing  a  sample  of  EO  material  (e.g.  PLZT) 
using  a  set  of  equations  that  relate  scattering, 
depolarization  and  electrooptic  effect  to  applied 
electric  field.  By  integrating  these  responses  with  the 
calculation  of  the  field,  for  given  device  geometries 
and  driving  voltages,  we  have  achieved  excellent 
correlation  to  experimental  data.  In  this  paper  we 
describe  our  EO  characterization  and  device 
modeling,  with  an  emphasis  on  comparisons  of 
computer  simulation  to  fabricated  devices. 

Transmittance  Model  and  Characterization 

Previous work [3] and experimental evidence show
that light passing through PLZT-based devices
experiences depolarization due to multiple scattering
effects. Therefore, we use the coherency matrix
formalism, which describes partially polarized light
by taking into account polarized and unpolarized
components, to generate an equation describing the
intensity of a monochromatic plane wave transmitted
through our apparatus. For example, for linearly
polarized light passing through crossed polarizers set
at 45° the equation relating transmitted intensity to
change in relative phase is given by

$$ I = A + B + 2C \pm 2\sqrt{AB}\,\cos(\phi) \qquad (1) $$

where A and B represent the polarized light intensity
measured along the two orthogonal basis directions, C is the
unpolarized component, and φ is the change in relative
phase between the two orthogonal components.
Choosing four pairs of angles, for the incident
polarizer and analyzer, gives four equations which
are used to solve for the relative change in phase φ.
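
To make the role of Eq. 1 concrete, the sketch below evaluates the transmitted intensity and inverts it for the relative phase. It takes Eq. 1 in the form I = A + B + 2C ± 2√(AB)·cos(φ), and it assumes that A and B are already known and that both the "+" and "−" configurations have been measured. This is a simplified two-measurement illustration, not the four-equation system used in the paper, and the function names are hypothetical.

```python
import math

def transmitted_intensity(a, b, c, phi, sign=+1):
    """Eq. 1: I = A + B + 2C +/- 2*sqrt(A*B)*cos(phi)."""
    return a + b + 2.0 * c + sign * 2.0 * math.sqrt(a * b) * math.cos(phi)

def relative_phase(i_plus, i_minus, a, b):
    """Recover phi in [0, pi] from the '+' and '-' measurements.

    Subtracting the two forms of Eq. 1 cancels A + B + 2C and leaves
    i_plus - i_minus = 4*sqrt(A*B)*cos(phi).
    """
    cos_phi = (i_plus - i_minus) / (4.0 * math.sqrt(a * b))
    cos_phi = max(-1.0, min(1.0, cos_phi))  # clamp against measurement noise
    return math.acos(cos_phi)

# Illustrative values only; not measured data.
a, b, c, phi_true = 0.4, 0.3, 0.05, 1.2
i_plus = transmitted_intensity(a, b, c, phi_true, sign=+1)
i_minus = transmitted_intensity(a, b, c, phi_true, sign=-1)
phi_est = relative_phase(i_plus, i_minus, a, b)
print(phi_est)
```

Because the unpolarized term C enters both measurements identically, it drops out of the difference, which is why depolarization reduces contrast without shifting the recovered phase in this simplified picture.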

By sandwiching our PLZT sample between two
large, parallel, gold-plated copper plates we generate
a constant field inside and outside the ceramic. The
distance between plates and the thickness of the
PLZT give us the electric field strength and
interaction length, respectively. We experimentally
measure the normalized optical intensity for various
orientations of the polarizer/analyzer pairs (see
Figure 1). For incident light polarized at 45° with a
parallel analyzer we see a sinusoidal variation of the
intensity as the relative phase is changed with applied
voltage, as predicted in Eq. 1. In the measurement of
the vertical (0°) and horizontal (90°) components we
see the effects of scattering and depolarization as the
field is increased.


Figure 1. The sinusoidal curve is the transmission
through parallel polarizers oriented at 45° to the direction
of the electric field. The mismatch between this curve and
the ‘envelope’ is due to a wedge shaped PLZT sample that
results in slightly varying path length within the sampled
region.

Using  the  transmission  data  from  Figure  1,  we 
solve for the change in relative phase φ as a function
of  electric  field  (see  Figure  2).  To  accurately  curve  fit 
the  EO  data  we  need  to  use  at  least  a  fifth  order 
polynomial,  indicating  that  the  EO  behavior  of  these 
PLZT  ceramics  cannot  be  fully  described  by  the 
quadratic  EO  effect. 


PLZT  8.9/65/35  composition. 


Device  Modeling 

For  a  plane  wave  propagating  through  a  PLZT 
based  device  we  can  use  a  standard  index  ellipsoid 
approximation  [4]  to  find  the  index  change  for  any 
two  orthogonal  polarization  components  of  the  light. 
We  use  a  commercial  finite  element  analysis  (FEA) 
tool  to  calculate  the  electric  field  distribution  for 
arbitrary  device  geometries.  By  integrating  the 
change  in  index  of  refraction,  scattering  and 
depolarization  data  with  the  calculated  electric  field 
components  we  can  determine  the  change  in  index  as 
a  function  of  position  for  any  given  device  geometry. 

For an incident beam linearly polarized at 45° with
respect to the electrodes and using an analyzer placed
at -45° we compare our simulation with a fabricated
surface electrode device. For this comparison, we
used 1 mm CrAu electrodes spaced 500 µm apart and
averaged the transmitted intensity from the center
100 µm of the gap. As can be seen in Figure 3, we
get an excellent match of device behavior, including
the ‘flat’ response at low voltage, the reduction of
maximum intensity due to scattering and the decrease
in contrast due to an unpolarized bias component.


Voltage  (V) 


Figure  3.  Comparison  of  measured  performance  of 
fabricated  surface  electrode  PLZT  device  (dots)  and 
computer  simulation  using  new  modeling  method  (line). 
Device has 2 mm wide CrAu electrodes separated by a 500
µm gap and evaporated onto a 380 µm thick wafer.


Conclusion 

By using four polarizer/analyzer orientations and
taking measurements of optical response versus
applied voltage we generate sufficient information to
calculate the relative change in index of refraction for
PLZT. From these measurements we extract
information on the scattering, depolarization and
change in relative phase. By incorporating that same
information into simulating device behavior we then
accurately model arbitrary geometries for
polarization rotation devices. We are currently using
this technique to optimize device design for a variety
of performance characteristics including maximum
contrast, efficiency and minimized cross talk. This
empirically-based modeling of electrooptic devices is
essential for developing accurate and reliable CAD
tools necessary for design of future optoelectronic
systems.

Folded free-space polarization-controlled
multistage interconnection network


Dan  M.  Marom,  Paul  E.  Shames,  Fang  Xu,  and  Yeshaiahu  Fainman 


We present a folded free-space polarization-controlled optical multistage interconnection network (MIN)
based on a dilated bypass-exchange switch (DBS) design that uses compact polarization-selective
diffractive optical elements (PDOE’s). The folded MIN design has several advantages over that of the
traditional transparent MIN, including compactness, spatial filtering of unwanted higher-order
diffraction terms leading to an improved signal-to-noise ratio (SNR), and ease of alignment. We
experimentally characterize a folded 2 × 2 switch, as well as a 4 × 4 and an 8 × 8 folded MIN that we
have designed and fabricated. We fabricated an array of off-axis Fresnel lenslet PDOE’s with a 30:1 SNR
and used it to construct a 2 × 2 DBS with a measured SNR of 60:1. Using this PDOE array in a 4 × 4 MIN
resulted in an increased SNR of 120:1, highlighting the filtering effect of the folded design. © 1998
Optical Society of America

OCIS  codes:  090.1760,  060.4250,  200.4650,  060.1810,  200.2610,  230.5440. 


1.  Introduction 

As the demand for communication and computing services increases, there is a correlated growth in the
need to switch among large numbers of input-output ports that carry high-bandwidth signals.
Multistage interconnection networks (MIN’s) are an attractive switching architecture because of the
minimal number of switching elements required.^ An optical MIN switching system routing
high-bandwidth optical signals can play an important role in the development of ultrahigh-bandwidth
interfaces with high-capacity parallel-access optical memories as well as for memory distribution.
Various optical MIN system implementations have been reported, including guided-wave optics that use
LiNbO3 switches,² free-space optics with optoelectronic switches based on symmetrical
self-electro-optic effect devices,³ and transparent switches based on polarization modulators.^-®
Transparent switches, in which an optical signal propagates without regeneration, do not introduce the
additional limitations of optoelectronic device cost, speed, and power. Totally transparent systems
depend on a centralized controller’s performing the routing algorithm and serving user requests.
Recently such a 4 × 4 free-space optical MIN based on polarization-selective diffractive optical
elements (PDOE’s) was demonstrated.® Optical MIN’s have inherent insertion losses that can limit system
scalability. However, attenuation can be compensated for by use of optical fiber amplifiers.
Polarization-dependent systems can also take advantage of the recent advances in
polarization-maintaining fibers or compensation in single-mode fibers.^® Other performance metrics that
can limit MIN scalability are optical cross talk, system compactness, system stability, and ease of
alignment.

In this paper we describe the design and implementation of a free-space optical MIN by use of a novel
folded dilated bypass-exchange switch (DBS) built of PDOE’s that addresses these limitations. The use of
a DBS allows for the elimination of the first-order cross talk that results from inaccuracies of
polarization rotation and diffractive optical element fabrication errors. By utilizing the
three-dimensional functionality of the optical elements, one can then stack the DBS elements in the
vertical dimension by folding (by use of a mirror plane) the switch along a central line of symmetry.
The interconnection among multiple DBS’s can also be folded, forming an optical MIN. This results in a
resonator-type structure in which all the switching elements are distributed on a plane, providing a
highly compact optical system that can easily be aligned. Additionally, the use of space-variant
diffractive optics permits the implementation of any interconnection topology corresponding to
arbitrary network architectures. Finally, the use of micromirrors to fold the DBS allows for spatial
filtering of the undesired diffraction orders, thereby decreasing the cross talk.

Fig. 1. BES functionality block diagram: solid lines, bypass mode in which the input to channel 1 goes
to the output of channel 1; dashed lines, exchange mode in which the input to channel 1 goes to the
output of channel 2 and vice versa.

In Section 2 we review MIN switching concepts based on PDOE’s. In Section 3 we introduce the
folded implementation of the DBS by use of PDOE’s. In Section 4 we discuss system design and component
fabrication and characterization. Performance evaluation of a 2 × 2 folded DBS as well as multistage
4 × 4 and 8 × 8 folded optical MIN’s is presented in Section 5. Finally, in Section 6 we summarize our
research and discuss conclusions.

2.  Free-Space  Optical  Multistage  Interconnection 
Networks  with  Bypass-Exchange  Switches 
The basic structure of a MIN has alternating arrays of fixed interconnection patterns and switching
modules, typically bypass-exchange switches (BES’s). The MIN architecture determines the fixed
interconnection pattern between switching stages and the number of stages implemented. Various MIN
architectures are differentiated by the number of switching stages, complexity of routing algorithms,
and network protocols. An optical MIN implemented with space-variant lenslets in free space permits the
implementation of arbitrary network architectures and interconnection patterns. Here we demonstrate a
folded 8 × 8 MIN based on a banyan architecture. However, the cross-talk and fabrication issues
addressed also apply to any other architecture implementation.

The BES is a 2 × 2 switch with two allowed states: bypass, in which the signals of the two channels are
unchanged (i.e., the input to channel 1 goes to the output of channel 1), and exchange, in which the
signals go to the opposite output ports or channels (Fig. 1). Other possible states, known as broadcast
and combine, are not considered in this application. A possible optical implementation of the BES uses
the polarization state for switching. Two orthogonally polarized light beams are controlled by a
polarization rotator to set the state of the switch and the polarization-selective optical elements
(e.g., polarization beam-splitter cubes, birefringent crystals, and PDOE’s) to direct the beams.
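
The two allowed BES states can be captured in a small behavioral model. The sketch below is only a logical abstraction of the polarization-based switch described above: the labels "H" and "V" for the two orthogonal polarizations, and the choice of which polarization is routed to which output, are assumptions made for illustration.

```python
def polarization_bes(signal_1, signal_2, rotator_on):
    """Behavioral model of a 2 x 2 bypass-exchange switch.

    Input 1 is tagged with one linear polarization ("H") and input 2 with
    the orthogonal one ("V"). The rotator, when on, swaps the two
    polarizations; the output element then routes by polarization.
    """
    beams = [("H", signal_1), ("V", signal_2)]
    if rotator_on:
        flip = {"H": "V", "V": "H"}
        beams = [(flip[pol], sig) for pol, sig in beams]
    routed = dict(beams)  # polarization -> signal
    # Assumed routing: "H" exits at output 1, "V" exits at output 2.
    return routed["H"], routed["V"]

print(polarization_bes("a", "b", rotator_on=False))  # bypass
print(polarization_bes("a", "b", rotator_on=True))   # exchange
```

The model ignores cross talk entirely; it only shows why a single control signal on the rotator is enough to select between bypass and exchange.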

A birefringent computer-generated hologram (BCGH) is an example of a compact and efficient PDOE. A BCGH
element has an independent impulse response for each state of the two orthogonal linear polarizations,
achieved by the etching of phase encodings into birefringent media. A compact 2 × 2 optical BES that
uses BCGH elements has been demonstrated⁵ (Fig. 2): The first BCGH element combines and focuses two
inputs into the polarization rotator, which either exchanges their polarizations or does not. The
second BCGH separates and directs the outputs to different destinations according to their polarization
states.

Fig. 2. Optical implementation of the BES: the first BCGH collimates the two input beams and the second
BCGH directs the output beams depending on the polarization states. The voltage on the polarization
rotator determines the state of the switch.

Inaccuracies of the polarization rotator and the BCGH fabrication can result in cross talk in this
implementation of the BES. The polarization rotator can be characterized by an associated error of δ in
the rotation angle, which results in a cross-talk term proportional to sin(|δ|). The BCGH elements can
be described by an associated cross talk ε that is due to fabrication errors such as etch depth and
misalignment among multiple masks. The combined cross-talk component at the output of the BES is
proportional to |δ| + |ε|, assuming that δ, ε ≪ 1. The signal-to-noise ratio (SNR) of a MIN can be
described by

$$ \mathrm{SNR} = \log_{10}\frac{1}{\delta_t} - \log_{10} S, \qquad (1) $$

where δ_t = |δ| + |ε| and S is the number of interconnection stages.® For increased scalability of the
MIN network size (i.e., S is growing) the cross talk of each stage must be reduced, yielding the SNR
necessary to support the desired bit-error rate.

An improvement in cross-talk performance can be achieved by use of a DBS." The DBS, which has two input
and two output signals, comprises four 1 × 2 switches coupled together. The structure of the DBS
guarantees that each 1 × 2 switch has only one signal propagating through it and that the majority of
the cross-talk terms exit from the unutilized output ports. It can be shown that the remaining cross
talk δ_t is now reduced to δ² + ε². Under the assumption that δ, ε ≪ 1, this performance is a
significant improvement over that of a conventional BES.

Fig. 3. Optical implementation of the DBS. Each BCGH element performs 1 × 2 switching, depending on the
state of the polarization-rotator element. The input and the output states of the two channels are
identical, permitting filtering with a polarizer of linear cross talk at the output.
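
A short numerical comparison helps show how much the dilated switch can gain. The sketch below assumes the SNR expression has the form SNR = log10(1/δ_t) − log10(S), that the BES cross talk is |δ| + |ε|, and that the DBS leaves only second-order terms δ² + ε². These forms are read from partly illegible equations in the source and should be treated as assumptions; the error values are illustrative and not measured.

```python
import math

def min_snr(delta_t, stages):
    """Assumed form: SNR = log10(1/delta_t) - log10(S)."""
    return math.log10(1.0 / delta_t) - math.log10(stages)

def bes_crosstalk(delta, eps):
    # First-order cross talk of a conventional bypass-exchange switch.
    return abs(delta) + abs(eps)

def dbs_crosstalk(delta, eps):
    # Assumed residual cross talk of the dilated switch (second order).
    return delta ** 2 + eps ** 2

delta, eps, stages = 0.02, 0.03, 3  # illustrative values only
snr_bes = min_snr(bes_crosstalk(delta, eps), stages)
snr_dbs = min_snr(dbs_crosstalk(delta, eps), stages)
print(snr_bes, snr_dbs)
```

With small δ and ε the second-order residual is far below the first-order sum, so the DBS value of the logarithmic SNR is markedly higher for the same number of stages.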

A free-space DBS can be implemented with a combination of four BCGH’s and four polarization-rotator
elements (Fig. 3). Depending on the state of the first set (i.e., the first elements in channels 1 and
2) of polarization rotators, the first set of BCGH elements defines the bypass or exchange functionality
of the switch. The second set of BCGH lenslets directs the output beams to the next DBS array (for a
multistage configuration), where the direction is dictated by the interconnection architecture that is
being implemented. The second set of polarization rotators returns the output polarizations to their
original input states.

Unlike  in  the  BES,  in  a  DBS  the  polarization  state 
of  each  channel  remains  independent  of  the  others. 
In  our  case  we  specify  the  input  and  the  output  beams 
to  have  identical  polarization  states.  Therefore,  by 
placing  a  polarizer  at  the  output  of  the  DBS,  we  can 
eliminate  the  linear  cross  talk.  In  this  case  the  four 
polarization-rotator  elements  will  always  be  in  the 
same  states,  i.e.,  all  on  or  all  off. 

3.  Folded  Dilated  Bypass-Exchange  Switch  and 
Optical  Multistage  Interconnection  Network 

The DBS’s complexity, although it mitigates linear cross-talk problems, increases the number of
components required for it to have the same functionality as the BES. However, one can reduce the
complexity of these switches by taking advantage of the symmetry of the DBS (dashed line in Fig. 3) and
the three-dimensional functionality of our free-space optical elements. One does this by introducing a
propagation-direction component along the vertical axis, i.e., a small incidence angle, as well as by
placing a mirror at the line of symmetry (Fig. 4). The input beams will pass through a rotator-BCGH
combination at one elevation and react according to the encoded information at that location, switching
information in the horizontal direction. On reflection from the mirror the beam passes through another
BCGH-rotator combination but at a lower elevation (Fig. 4). By folding the switch in this manner, one
locates similar elements (e.g., BCGH lenslets) in the same plane. Therefore a single DBS can be
fabricated by use of a mirror and 2 × 2 arrays of BCGH’s and polarization-rotator elements. The four
polarization rotators, which are always in identical states, can be replaced with one larger
polarization-rotator element. However, if we wish to consider other switching functionalities such as
broadcast and combine states, the four polarization rotators have to be controlled separately.

Fig. 4. Folded optical DBS. Similar elements (i.e., BCGH’s, polarization rotators) are placed in
two-dimensional arrays. Micromirrors reflect only the desired diffraction order and filter out the
unwanted orders. The four polarization rotators are always in the same state and can be replaced by one
larger-sized element.

The advantage of this folding technique is further enhanced when it is applied to an optical MIN.
When a mirror is placed at the output of the first folded DBS, the beam will reflect back at a lower
elevation and be coupled into subsequent DBS’s located below the first. In this manner all similar
elements of multiple DBS’s can be combined into two-dimensional arrays, minimizing the number of
components required for the entire MIN: a single BCGH array, a polarization-rotator array, and a pair
of folding micromirror arrays. A folded optical MIN is packaged as a resonator in which each round trip
represents a stage and all stages are stacked vertically (Fig. 5). An input signal beam enters the
system at a small angle and reflects through a prescribed number of stages before exiting in the
desired spatial output channel.

Implementation of a free-space optical MIN by use of this folding technique and BCGH space-variant
lenslets presents several unique advantages:

(a) Arbitrary architecture: The use of space-variant lenslets in each polarization-selective element
allows for the design of arbitrary connection patterns such that any multistage network topology can be
implemented. In the folded optical MIN the number of channels and the interconnection architecture used
dictate the size of the arrays but do not increase the number of components. For example, an 8 × 8
optical MIN architecture (of log2 8 = 3 stages) requires arrays of size 8 × 6 in BCGH’s and
polarization-rotator elements.

Fig. 5. Folded 8 × 8 optical MIN with one input beam, shown propagating from channel 1 to channel 5.
In this example there are three stages of DBS’s, which require three round-trip travels in the
micromirror cavity.

(b) Spatial filtering: A further advantage of the folded switch is that the mirror planes can also
implement filtering functionality to increase SNR performance further. BCGH’s are diffractive elements
that, depending on the element design, can produce undesired diffraction terms. When continuous mirrors
are used these undesired orders can propagate within the MIN, resulting in additional cross-talk noise
at the output. However, if micromirrors deposited upon a transparent substrate are used, only the
desired diffraction orders from the BCGH’s will reflect back for further propagation, while the
unwanted noise terms exit the system.

(c) Alignment: The arrangement of the optical elements
in a two-dimensional array format also allows
for relatively simple alignment of the system components.
Correct alignment will dictate that the
beams land on the correct elements during each pass
through the cavity. The displacement of each beam
from its correct position and the size of the beam at
the BCGH elements (i.e., larger or smaller than the
predicted size at the element) will indicate which
optical elements (BCGH's, micromirrors, etc.) are incorrectly
positioned. Inasmuch as the micromirror
planes are mostly transparent, beam propagation
within the cavity can be viewed with external imaging
optics and a CCD camera. The beam size and
position can therefore be monitored in situ, allowing
for accurate alignment of optical elements and mirror
planes.
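
The array-size rule stated in item (a) can be sketched in a few lines. This is only an illustration: it assumes that the 8 × 6 figure quoted for the three-stage 8 × 8 network generalizes to N columns and two rows of elements per stage, and the function name `element_array_shape` is hypothetical.

```python
def element_array_shape(num_channels: int) -> tuple[int, int]:
    """Return (columns, rows) of the BCGH / polarization-rotator arrays.

    Assumption: two rows of elements per stage, inferred from the 8 x 6
    arrays quoted for the three-stage 8 x 8 network.
    """
    if num_channels < 2 or num_channels & (num_channels - 1):
        raise ValueError("num_channels must be a power of two >= 2")
    stages = num_channels.bit_length() - 1  # log2(N) for a power of two
    return num_channels, 2 * stages


for n in (4, 8, 16):
    print(n, element_array_shape(n))
```

Under this assumption only the array dimensions grow with N; the number of physical components (BCGH array, rotator array, two mirror planes) stays the same.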

4.  System  Design,  Component  Fabrication,  and  Their 
Characterization 

To  demonstrate  a  folded  system,  we  designed  and 
constructed  an  8  x  8  optical  MIN  system  (Fig.  6) 
based  on  a  fully  connected  banyan  architecture.® 
The design process of the system incorporates the
following criteria: (i) maximization of the number of
rings in off-axis Fresnel lenslets, (ii) minimum feature
size of diffractive elements determined by the
available fabrication technologies, and (iii) separation
of diffractive-order beams at the micromirror
plane. We developed a system-modeling tool by using
Gaussian beam analysis of the stable mode of the
micromirror-based cavity (Fig. 5) that calculates
these three parameters for a given cavity dimension.
We found the optimal size of the cavity by varying the
cavity dimensions over our design space and maximizing
the above criteria.

Fig. 7. CCD image of a single BCGH element within an 8 × 6
array, showing multifunctional superposition of polarization-selective
Fresnel lenslets.

The Gaussian beam spot size^^ at the BCGH lenslet
plane was used as a limiting design constraint because
at that location the beam size is largest. For a
spot size greater than the lenslet (i.e., array pitch),
optical power would leak into adjacent elements and
give rise to cross talk. A spot size much smaller
than the BCGH lenslet would result in a diminished
diffraction efficiency. Our pitch size of 1 mm was
determined by the pixel size of the
polarization-rotator array used in our experiments.
A beam spot size of 0.825 mm was used and provided
minimal cross talk, high diffraction efficiency, and
high power throughput (97% of the beam energy is
contained in the 1 mm × 1 mm square). Accounting
for beam propagation through multiple optical elements
(BCGH, polarization rotator, etc.) yielded a
lenslet focal length of 85.1 mm with a 300-µm waist
size at one mirror plane and a 100-µm waist size at
the other. The cavity length is 407 mm, with the
BCGH element placed 107 mm from the back side
mirror. The 1-mm pitch of the optical elements also
dictates that the input light beam have an incidence
angle of 0.4°. The polarization-rotator array is
placed adjacent to the BCGH array to best match the
1-mm pitch of the ferroelectric liquid-crystal (FLC)
elements.
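
The throughput figure quoted above can be checked with a short calculation. The sketch below assumes that the 0.825-mm spot size is the 1/e^2 intensity diameter of a centered TEM00 beam; the function name is hypothetical and the calculation is not the authors' modeling tool.

```python
import math


def power_fraction_in_square(spot_diameter_mm: float, pitch_mm: float) -> float:
    """Fraction of a centered Gaussian beam's power inside a pitch x pitch square.

    Assumption: spot_diameter_mm is the 1/e^2 intensity diameter (2w).
    The intensity profile exp(-2 r^2 / w^2) is separable in x and y, and the
    one-dimensional fraction within |x| < a is erf(sqrt(2) * a / w).
    """
    w = spot_diameter_mm / 2.0   # 1/e^2 radius
    a = pitch_mm / 2.0           # half-width of the square element
    one_dimensional = math.erf(math.sqrt(2.0) * a / w)
    return one_dimensional ** 2


fraction = power_fraction_in_square(0.825, 1.0)
print(f"fraction of power inside one element: {fraction:.3f}")
```

With these inputs the result is roughly 97%, in line with the value stated in the text.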

Each of the designed BCGH lenslets functions as
two independent off-axis Fresnel lenses for the two
orthogonal polarization states (Fig. 7), whose offset is
dictated by the deflection angle required by the interconnection
pattern. The largest deflection angle
for our 8 × 8 banyan network is 0.8°, corresponding to
shifting the beam by 3 pixels. The BCGH was designed
by use of the multiple-order delay approach^°
and fabricated in a YVO4 crystal selected for its high
value of birefringence. The advantages of using off-axis
lenses include the following results: (i) The unwanted
zero-order diffraction term can be filtered out
of the cavity. Because the diffraction into the zero
order is more sensitive to fabrication errors, which
results in a strong unwanted residual component, we
find that maximum extinction ratios can be attained
by use of the first-order diffraction terms. (ii) The
unwanted higher-order diffraction light is dispersed
over a large area of the micromirror plane and is not
focused onto the micromirrors [Fig. 8(a)]. The
amount of optical power incident upon adjacent micromirrors
(i.e., noise) and reflected back into the
system is determined by the ratio of the area of the
micromirror to the area of the diffracted order at the
plane of the mirrors [Fig. 8(b)].
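
As a rough geometric check of the 0.8° figure, the sketch below computes the angle needed to shift a beam sideways by three 1-mm pixels. It assumes, as one reading that reproduces the quoted number, that the shift accumulates over the double pass from the BCGH plane to the back mirror and back (2 × 107 mm, using the spacing given earlier in this section). The function name and this path assumption are not taken from the source.

```python
import math


def deflection_angle_deg(shift_pixels: int, pitch_mm: float, path_mm: float) -> float:
    """Angle (degrees) that displaces a beam by shift_pixels * pitch_mm
    over a free-space path of length path_mm."""
    return math.degrees(math.atan2(shift_pixels * pitch_mm, path_mm))


# Assumed path: BCGH -> back mirror -> BCGH, i.e. twice the 107-mm spacing.
angle = deflection_angle_deg(shift_pixels=3, pitch_mm=1.0, path_mm=2 * 107.0)
print(f"deflection angle: {angle:.2f} deg")
```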

The measured first-order diffraction efficiencies of
the binary phase-level BCGH elements are 33% for
vertical polarization and 35% for horizontal polarization,
with extinction ratios better than 60:1. We also
fabricated and tested four-phase-level BCGH diffractive
elements, which incur greater cumulative etch-depth
errors because they have a more complicated
fabrication process. These elements yielded 30:1 extinction
ratios and diffraction efficiencies of 43% for
vertical polarization and 46% for horizontal polarization.
The relatively low extinction ratio and diffraction
efficiency (compared with a theoretical efficiency
of 80.5% for a four-phase-level diffractive optical element)
are due primarily to etch-depth inconsistencies in
the multistep fabrication process, which are described
by the error term e [see Eq. (1)].

Fig. 8. CCD pictures of (a) the output BCGH lenslet element for
one polarization state, showing the focused first-order light and
the unfocused higher diffraction orders, and (b) micromirrors (the
dark circles), reflecting only first-diffraction-order light, permitting
higher-diffraction-order light to exit the system.
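
For comparison with the measured values, the standard scalar-theory estimate for the first-order efficiency of an ideal N-level quantized phase profile is [sin(π/N)/(π/N)]^2. The sketch below evaluates it; it is a textbook approximation with no etch-depth error and is not derived from Eq. (1) of this paper.

```python
import math


def ideal_first_order_efficiency(levels: int) -> float:
    """Scalar-theory first-order efficiency of an ideal N-level phase element:
    [sin(pi/N) / (pi/N)]^2, with no fabrication error included."""
    if levels < 2:
        raise ValueError("levels must be at least 2")
    x = math.pi / levels
    return (math.sin(x) / x) ** 2


for n in (2, 4, 32):
    print(f"{n:2d} levels: {100 * ideal_first_order_efficiency(n):.1f}%")
```

The estimate gives roughly 40.5% for binary elements and roughly 81% for four levels, near the theoretical figure quoted above, so the measured 33% to 35% and 43% to 46% both fall short of the ideal, the four-level elements by a much wider margin.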

The patterned arrays of micromirrors were etched
onto a mirror fabricated by the evaporation of aluminum
film onto optical flats, for which the average
measured reflectance of the mirrors was 92%. The
circular micromirrors have diameters of 400 and 150
µm, slightly larger than the calculated beam diameter
at the mirror planes.

5.  Multistage  Interconnection  Network  System 
Experimental  Characterization 

Experimental testing of our 8 × 8 MIN system was
performed with a 488-nm cw Gaussian beam generated
by an argon laser. For initial testing we used
two optical input channels, one with a dc signal and
the other modulated by a NEOS Model N71003
acousto-optic (AO) cell. The polarization state of the
beam as it propagates through the network is controlled
by the two-dimensional array of FLC polarization
rotators (DisplayTech, Model 10 × 10B).
Reconfiguration of the FLC elements is under computer
control, with a maximum switching speed (i.e.,
frame rate) of 0.2 ms. The output signals are measured
by high-speed silicon p-i-n detectors.

For an 8 × 8 folded MIN the beam makes three
round trips in the cavity (i.e., three layers of BES's).
By diverting the beam after one or two passes, we are
able to use the same experimental system to test the
performance of a 2 × 2 (single DBS switch) or a 4 ×
4 network. Using the binary phase-level BCGH elements,
we measured the performance of a single
DBS switch, which yielded extinction ratios of
greater than 250:1. The extinction ratio for the DBS
is significantly better (4:1) than those of the individual
BCGH elements, which shows how the DBS eliminates
significant cross talk. However, because of
the low diffraction efficiency of the binary phase elements,
the DBS switch has an insertion loss of approximately
−11 dB and was not suitable for the
multistage system experiments.

Figure 9 shows the output from a single DBS, constructed
with the higher-efficiency four-phase-level
BCGH, as it reconfigures between the bypass and the
exchange modes at a 2-kHz rate (i.e., 500-µs packets).
The AO cell modulates one of the input signals
with a square wave at 40 kHz (we used this relatively
slow input signal to permit simultaneous oscilloscope
visualization of both AO and FLC reconfiguration
frequencies). The top and the bottom traces in Fig. 9
show the outputs of channel 1 and channel 2, respectively.
Both configurations (i.e., bypass and exchange)
produce extinction ratios (defined as the ratio
between the on state and the off state when one
input signal is present) greater than 59:1 and a SNR
(defined as the ratio of the signal to the noise at the
same output, i.e., cross talk between two input signals)
of greater than 57:1. Results of using signals
ranging from dc to 10 MHz show similar SNR's, highlighting
the optical transparency of the system. The
extinction ratio of the DBS is only 2:1
better than the extinction ratio of the diffraction orders
of the four-phase-level elements used. This result
is attributed to the much stronger cross talk that
is due to fabrication errors (e) seen in these elements.
The higher diffraction efficiency of the four-phase-level
BCGH reduced the insertion loss of the DBS to
approximately −9 dB.

Fig. 9. Output signals for a single folded DBS (2 × 2 switch) with
two input signals: a dc signal and a 20-kHz signal. The switch
is reconfiguring at a 1-kHz rate, limited by the 100-µs characteristic
rise time of the FLC. The measured average SNR is 57:1.
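
The link between diffraction efficiency and insertion loss can be illustrated with a simple loss budget. The sketch below assumes two BCGH passes and two mirror reflections per DBS stage (a guess consistent with the two element rows per stage in the 8 × 6 arrays) and ignores FLC and surface-reflection losses, so it should come out somewhat better than the measured values. The function name and the pass counts are hypothetical.

```python
import math


def stage_loss_db(bcgh_efficiency: float, mirror_reflectance: float,
                  bcgh_passes: int = 2, mirror_bounces: int = 2) -> float:
    """Insertion loss (dB, negative) of one folded DBS stage, counting only
    diffraction and mirror losses. Pass counts are assumptions."""
    transmission = (bcgh_efficiency ** bcgh_passes
                    * mirror_reflectance ** mirror_bounces)
    return 10.0 * math.log10(transmission)


# Mean measured efficiencies (33%/35% and 43%/46%) and 92% mirror reflectance.
binary = stage_loss_db(0.34, 0.92)
four_level = stage_loss_db(0.445, 0.92)
print(f"binary elements:     {binary:.1f} dB")
print(f"four-level elements: {four_level:.1f} dB")
```

Under these assumptions the estimates are about −10 dB and −8 dB, each roughly 1 dB better than the measured −11 dB and −9 dB, which would be consistent with additional losses in the elements left out of the budget.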

By allowing the beams to propagate two round
trips through the cavity (by use of the four-phase-level
BCGH elements), we experimentally characterized
a multistage 4 × 4 system. Using a single dc
input signal that switched among four output channels,
we measured an average SNR of 90:1 and an
extinction ratio of 120:1 (a 4:1 improvement over the
individual BCGH elements). To investigate cross
talk further, we introduced a second input signal.
The measured output amplitudes are shown in Fig.
10. Most notable is that the output intensities vary
depending on the output as well as the input channel,
a result that might occur because of the variation of
the lenslet diffraction efficiencies for different polarization
states. The minimal average (i.e., the weakest
output signal to the strongest output noise) SNR
is 87:1.

The complete 8 × 8 interconnection system was
characterized with a single dc input signal that
switched among all eight output channels. The output
signals were relatively weak and were therefore
imaged onto a CCD camera (which integrates over
time) for detection (see Fig. 11). The average measured
SNR was better than 30:1. This relatively low
SNR can be attributed to the strong background noise
and the small dynamic range of the CCD device used.
We performed similar measurements by using two
input signals that switched among all eight output
channels, which gave similar SNR results.

Fig. 10. Output from a 4 × 4 switch with two input signals (A and
B) routed to the four output channels by a host computer controller.
The average SNR is 120:1, which is twice as great as that of
the single DBS performance, highlighting the filtering capabilities
of the folded MIN configuration.

Fig. 11. CCD time-sequenced images showing a single input dc
signal routing among eight output channels. The measured average
SNR of 30:1 is derived from the CCD pixel values. This
relatively low value might be due to the poor dynamic range and
the background noise of the CCD device.


6.  Discussion 

We have described the design, fabrication, and testing
of an optical MIN by using a novel folded architecture
and compact polarization-selective BCGH
elements. The design process determines the optimal
system dimensions, which are constrained primarily
by the limits of the diffractive optical
fabrication facilities available. We have demonstrated
how the folded design allows for the elimination
of the first-order cross talk, ease of MIN system
alignment and packageability, as well as filtering of
unwanted high diffraction orders. The use of space-variant
lenslets also allows for the implementation of
arbitrary MIN architectures. The folded DBS (i.e.,
2 × 2 switches) improved the extinction ratio compared
with those of the single BCGH diffractive elements,
binary and four-phase level, used. Further
improvements in filtering out cross talk were seen in
the 4 × 4 interconnection system, with measured
extinction ratios of 120:1 (i.e., when input signals
pass through two layers of DBS switches).

Traditional nonfolded MIN systems are planar by
design and occupy an area proportional to their number
of stages. However, such is not the case for our
folded MIN, in which the stages are stacked vertically.
The footprint that the system occupies depends
on the length of the mirror cavity, which is
dictated by the focal length and the deflection angle of
the BCGH Fresnel lenslets. For off-axis lenslets the
maximum deflection angle will be constrained by the
minimum feature size of the fabrication process.
However, for feature sizes smaller than five wavelengths,^®
the diffraction efficiency can be adversely
affected.

When the N input channels are arranged in a 1 ×
N vector form, the number of elements in the polarization
rotator and BCGH arrays scales as N in the
horizontal dimension and as log(N) in the vertical.
As N increases, the ratio of the width to the height of
the arrays increases. For large N a relatively
wide system can result, which will require relatively
large deflection angles. An alternative strategy for
maintaining system compactness is to arrange the N
input channels in a rectangular array form (i.e., M
rows of length N/M). This procedure would redistribute
pixels from the same stage into multiple rows
and would result in a more symmetric system that
could significantly reduce the maximum degree of the
deflection angles required.
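
The scaling argument above can be made concrete with a small sketch. It assumes two element rows per stage (as in the 8 × 6 arrays of the 8 × 8 system) and that splitting the channels into M rows multiplies the array height by M while dividing its width by M. Both the layout model and the function name are illustrative assumptions, not a design taken from the source.

```python
def array_aspect_ratio(num_channels: int, channel_rows: int = 1) -> float:
    """Width-to-height ratio of the element array for N channels arranged
    in channel_rows rows of N / channel_rows channels each (assumed layout)."""
    if num_channels < 2 or num_channels & (num_channels - 1):
        raise ValueError("num_channels must be a power of two >= 2")
    if channel_rows < 1 or num_channels % channel_rows:
        raise ValueError("channel_rows must divide num_channels")
    stages = num_channels.bit_length() - 1      # log2(N)
    width = num_channels // channel_rows        # elements per row
    height = 2 * stages * channel_rows          # assumed rows of elements
    return width / height


for m in (1, 4, 8):
    print(f"M = {m}: width/height = {array_aspect_ratio(1024, m):.1f}")
```

For N = 1024 the ratio under this model falls from about 51 in the 1 × N arrangement to about 0.8 with eight rows, which illustrates why the rectangular arrangement leads to a more symmetric system.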

The optical transparency of the MIN allows for
transmission of very high data-rate signals. We
tested our folded MIN with signals from dc to 10 MHz
(the limit of our AO cell modulation speed) and found
no change in the SNR or the extinction ratio. We
expect system performance to be constant for signal
bandwidths into the multigigahertz range. The limiting
factor in interconnection reconfiguration is the
rise and fall times of the employed FLC polarization-rotation
elements. State-of-the-art FLC devices
have rise times of approximately 10 µs, but other
electro-optic materials could provide devices with
orders-of-magnitude faster response times.^’

The efficiency of the BCGH elements is the limiting
factor in increasing the SNR of the multistage system.
In our case the poor performance of the
Fresnel lenslets can be attributed to two factors: (i)
inaccurate etching depths owing to ion-etching device
inconsistencies and (ii) use of the multiple-order delay
approach to fabrication of BCGH elements, which
has increased sensitivity to etch-depth errors. Significantly
higher diffraction efficiencies can be expected
with improved etching facilities and the use of
other BCGH design approaches, such as dual-substrate^®
and form birefringent elements.

The use of high-efficiency diffractive elements
would also greatly reduce the insertion losses of these
systems. For example, using 32-phase-level BCGH
Fresnel lenslets with 97% diffraction efficiency®® and
dielectric mirrors with 99% reflectance as well as
antireflectance coating of all optical surfaces would
result in insertion losses less than −1 dB for each
stage. For an interconnection system with 10
stages, which allows for 1024 input channels, the
total insertion loss would be approximately −7 dB.
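
The stage count and the total loss in this example follow from simple arithmetic, sketched below. The per-stage value of −0.7 dB is not stated in the text; it is the figure implied by dividing the quoted −7 dB total over 10 stages, and the function names are hypothetical.

```python
def stages_for_channels(num_channels: int) -> int:
    """Number of 2 x 2 switching stages, log2(N), for N a power of two."""
    if num_channels < 2 or num_channels & (num_channels - 1):
        raise ValueError("num_channels must be a power of two >= 2")
    return num_channels.bit_length() - 1


def total_insertion_loss_db(num_channels: int, per_stage_loss_db: float) -> float:
    """Total loss in dB when every stage contributes the same loss."""
    return stages_for_channels(num_channels) * per_stage_loss_db


# Implied per-stage loss: -7 dB spread over 10 stages (an inferred value).
per_stage = -0.7
print(stages_for_channels(1024), "stages")
print(f"{total_insertion_loss_db(1024, per_stage):.1f} dB total")
```

A per-stage loss of about −0.7 dB is consistent with the bound of less than 1 dB of loss per stage quoted for the high-efficiency elements.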